Format PART of a specific word in InDesign - adobe-indesign

There's a unique, made-up word in a book I am editing. I need to italicize the first three letters of this word every time it occurs.
So far I have determined that GREP styles are my best shot at automatically formatting this word, but I have not been able to create a GREP string that works. Any help would be welcome!
Edit:
I managed to get a working GREP query, but this only works for me in the Find/Change dialog. I believe that these GREP strings need to be written a little differently depending on where they are used in the program...
By the way, the specific word I am looking for is youniverse. I need "you" to always be italicized.
My current working Find/Change GREP query is:
you(?=niverse)
This is a basic way to get the result I am looking for. Ideally, this would be a GREP Style in my main paragraph style, so the formatting would be applied automatically every time the word occurs.

If you need to match the first "Dan" of DankFarrik, use this in your GREP search and apply the appropriate character style:
(Dan|DankFarrik)
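Note that the pattern above also matches "Dan" in any other word. You can also try this one, which uses a lookahead so that "Dan" matches only when "kFarrik" follows: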
((Dan)(?=kFarrik))
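To check how these patterns behave before pasting them into InDesign, here is a minimal sketch that uses Python's re module as a stand-in for InDesign's GREP engine. That substitution is an assumption: the two engines differ in places, although both support the lookahead used here. The name mark_italics and the <i> tags are illustrative only; in InDesign the character style does that job.

```python
import re

# Lookahead: match "you" only when "niverse" follows, without consuming it.
YOUNIVERSE = re.compile(r"you(?=niverse)")


def mark_italics(text):
    """Wrap the part a character style would format in <i>...</i> (illustrative)."""
    return YOUNIVERSE.sub(lambda m: "<i>" + m.group(0) + "</i>", text)


print(mark_italics("Are you in the youniverse?"))

# Alternation tries "Dan" first, so it also hits "Dan" inside unrelated words.
print(re.findall(r"(Dan|DankFarrik)", "Dance with DankFarrik"))

# The lookahead version only hits the "Dan" that starts "DankFarrik".
print(re.findall(r"Dan(?=kFarrik)", "Dance with DankFarrik"))
```

The standalone "you" is left alone because the lookahead requires "niverse" to follow, which is the behaviour the question asks for.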

Related

How to read out value from website in bash?

I want to read out and later process a value from a website (Facebook Ads) from a bash script that runs daily. Unfortunately I need to be logged in to get this value:
So far I've figured out how to log into this website on Firefox and save the html file where the value could theoretically be read out:
The only unique identifier in this file is the first instance of "Gesamtausgaben". Is there any way to use this information to cut out everything besides "100,10"?
I'd also be happy for a different kind of way to get this value. And no, I don't have any API access.
I appreciate all ideas.
Thanks,
Patrick
How to Parse HTML (Badly) with PCRE
You can't reliably parse HTML with just regular expressions, so you'll need an XML/HTML or XPath parser to do this properly. That said, if you have a PCRE-compatible grep, then the following will likely work, provided the HTML is minified and the class isn't reused elsewhere on your page.
$ pcregrep -o 'span class=".*_3df[ij].*>\K[^<]+' foo.html
100,10 €
If your target HTML spreads across multiple lines, or if you have multiple spans with the same classes assigned, then you'll have to do some work to refine the regular expression and differentiate between which matches are important to you. Context lines or subsequent matches may be helpful, but your mileage will definitely vary.
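For the parser route, a minimal sketch with Python's standard html.parser might look like the following. It collects the text nodes in document order and returns the first German-formatted amount that appears after the first "Gesamtausgaben". The sample markup, the class names in it, and the names TextCollector and first_amount_after are assumptions made for illustration; the real page will differ.

```python
import re
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect the non-empty text nodes of a document in order."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


# Amount with a decimal comma and optional thousands dots, e.g. 1.234,56
AMOUNT = re.compile(r"\d[\d.]*,\d{2}")


def first_amount_after(html, marker="Gesamtausgaben"):
    """Return the first amount found after the first occurrence of marker."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    seen = False
    for chunk in parser.chunks:
        if not seen:
            pos = chunk.find(marker)
            if pos == -1:
                continue
            seen = True
            # Only look at the text that follows the marker in this node.
            chunk = chunk[pos + len(marker):]
        match = AMOUNT.search(chunk)
        if match:
            return match.group(0)
    return None


# Hypothetical snippet standing in for the saved page.
SAMPLE = ('<div><span>Gesamtausgaben</span>'
          '<span class="_3dfi _3dfj">100,10 €</span></div>')
print(first_amount_after(SAMPLE))
```

Because it works on text nodes rather than raw markup, this does not care whether the HTML is minified or spread over several lines.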

how to add a separator after each word with ghostscript -sDEVICE=txtwrite

I have used ghostscript to successfully extract text from PDFs that have tables.
This simple command works very well:
gswin64c -sDEVICE=txtwrite -o test.txt "c:\reports\sample.pdf"
However, some words get joined together, especially in tables. For example:
234801111111109-12-2014 16:17:04764030208117034 2883253100.00 Payment
234801111111109-12-2014 16:18:461088956908117033 2883253400.00 Payment
234801111111109-12-2014 16:19:48769948208117040 2883253750.00 Payment
should actually be:
2348011111111 09-12-2014 16:17:04 764030208117034 2883253 100.00 Payment
2348011111111 09-12-2014 16:18:46 1088956908117033 2883253 400.00 Payment
2348011111111 09-12-2014 16:19:48 769948208117040 2883253 750.00 Payment
Is there a way to add a separator character at the end of each word?
That would solve this perfectly.
No sorry, this idea simply won't work.
There is no such thing as a 'word' in a PDF file, there is simply a sequence of character codes and positions. The txtwrite code goes to some lengths to try and reconstruct words by looking at the position of each piece of text, and the metrics of the fonts used, but there are no words in the original.
I don't claim this is perfect. If you'd like me to look at it, you will need to supply the original file. The best solution is to open a bug report and attach the file to it.
This is still an area I'm looking at, for a different project (RTF output) so now is a good time to report it. I cannot guarantee being able to resolve it, but it may well simply be that the 'rebuild the page layout' code is being too simple-minded about the location of the text.
You can, however, get a lower-level output: the XML-like output will give you each fragment of text individually, along with its position on the page. You could use that information yourself to rebuild the content.
The default option tries to build a simple representation of the page by using space characters to reproduce the layout of the original, as far as possible, but I have no illusions that there aren't bugs :-)
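As a rough sketch of rebuilding the content from positioned fragments, the following groups fragments into lines by vertical position and inserts a separator wherever the horizontal gap between two neighbours is large enough. It does not parse the txtwrite output itself: it assumes the fragments have already been read into (x0, x1, y, text) tuples, which is a hypothetical shape, and the thresholds and coordinates are made up for illustration.

```python
def rebuild_lines(fragments, sep="|", min_gap=2.0, y_tolerance=1.0):
    """Join positioned text fragments into lines, separating distinct columns.

    fragments: iterable of (x0, x1, y, text), with y growing upward (assumed).
    """
    lines = []
    current, current_y = [], None
    # Visit fragments from the top of the page down.
    for x0, x1, y, text in sorted(fragments, key=lambda f: -f[2]):
        if current_y is None or abs(y - current_y) > y_tolerance:
            if current:
                lines.append(current)
            current, current_y = [], y
        current.append((x0, x1, text))
    if current:
        lines.append(current)

    result = []
    for line in lines:
        line.sort()  # left to right within the line
        parts, prev_end = [], None
        for x0, x1, text in line:
            # A visible gap means a new word or cell, so emit the separator.
            if prev_end is not None and x0 - prev_end >= min_gap:
                parts.append(sep)
            parts.append(text)
            prev_end = x1
        result.append("".join(parts))
    return result


# Hypothetical coordinates for two table rows.
ROWS = [
    (10.0, 75.0, 700.0, "2348011111111"),
    (78.0, 120.0, 700.0, "09-12-2014"),
    (200.0, 230.0, 700.0, "100.00"),
    (10.0, 75.0, 688.0, "2348011111111"),
    (78.0, 120.0, 688.0, "09-12-2014"),
]
print(rebuild_lines(ROWS))
```

The gap threshold is the part that needs tuning per document, since tightly set table columns can sit closer together than ordinary word spacing.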

Find but skip strings and comments?

One thing that constantly annoys me about VS is that when I do a Find or Find all, it looks in comments, strings, and other places. When I'm trying to find a particular bit of code, like and rent, it finds it all over. Is there a way to limit searches just to code?
Not sure if there is a specific setting to ignore comments, but you could do a regex find. For example, assuming you want to find "text", you could use this:
^(?!\s*?//).*?text
Caveats:
Assumes comments start with // as first non-whitespace characters. E.g. C# comment types
Doesn't work for comments at the end of code lines (only comments on their own lines)
Doesn't work with block comments, for example /* comment */
So overall it isn't perfect by any means, but depending on how many hits you are getting, it might help to cut them down, which can be useful if you have a lot of false positives in one-line comments.
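The behaviour and the caveats can be tried outside the editor. This is a minimal sketch that applies the same expression line by line with Python's re module, assuming it treats this pattern the same way as the Visual Studio search does; the function name code_line_hits and the sample text are illustrative.

```python
import re


def code_line_hits(source, term):
    """Return 1-based numbers of lines that match and do not start with //."""
    pattern = re.compile(r"^(?!\s*?//).*?" + re.escape(term))
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if pattern.match(line)
    ]


SAMPLE = """\
// text in a comment line
var text = 1;
    // indented comment with text
call(); // trailing comment text
/* block comment text */
"""

print(code_line_hits(SAMPLE, "text"))  # prints [2, 4, 5]
```

Lines 1 and 3 are skipped as intended, while lines 4 and 5 are the false positives described in the caveats: a trailing comment and a block comment.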
The 'Find All References' function may suit you: it ignores all commented-out code and text in strings. CTRL+K, R is the keyboard shortcut.
(Note that it's designed for going from a specific instance of a search string to all other instances, so if you haven't already found an instance of what you're searching for, you would have to (temporarily) type one into the editor window, then search. Also, it's not available for all languages: I know it works fine for C#, though.)

How to find foreign language used in "C comments"

I have a large code base where most of the documentation and source code comments are in English. But one of the minor contributors wrote comments in a different language, spread across various places.
Is there a simple trick that will let me find them? I imagine first a way to extract all comments from the code and generate a single text file (with possible source file / line number info), then pipe this through some language detection app.
If that matters, I'm on Linux and the current compiler on this project is Clang.
The only thing that comes to mind is to go through all of the code manually and check it yourself. If it's a similar language that doesn't contain foreign letters, consider using something with a spellchecker. This way, the text that isn't recognized will get underlined and be easy to spot.
Other than that, I don't see an easy way to go through with this.
You could make a program that reads the files and prints only the comments to another output file, which you then spell check, but this would seem to be a waste of time, as you would easily be able to spot the comments yourself.
If you do make a program for that, however, keep in mind that there are three things to check for:
If comment starts with /*, make sure it stops reading when encountering */
If comment starts with //, only read one line - unless:
If line starting with // ends with \, read next line as well
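The three rules above can be sketched as a small scanner. This is a simplified illustration, not a full C lexer: it also skips string and character literals so that a // inside a string is not mistaken for a comment, but it ignores trigraphs and preprocessor corner cases. The function name extract_comments and the sample source are assumptions.

```python
def extract_comments(source):
    """Return a list of (line_number, text) for each C comment in source."""
    comments = []
    i, n, line = 0, len(source), 1
    while i < n:
        ch = source[i]
        two = source[i:i + 2]
        if two == "/*":
            # Block comment: read until the closing */ (or end of input).
            end = source.find("*/", i + 2)
            if end == -1:
                end = n
            comments.append((line, " ".join(source[i + 2:end].split())))
            line += source.count("\n", i, min(end + 2, n))
            i = end + 2
        elif two == "//":
            # Line comment: stop at a newline unless a backslash precedes it.
            j = i + 2
            while j < n and not (source[j] == "\n" and source[j - 1] != "\\"):
                j += 1
            text = source[i + 2:j]
            comments.append((line, " ".join(text.replace("\\\n", " ").split())))
            line += text.count("\n")
            i = j
        elif ch in "\"'":
            # String or character literal: skip to the matching quote.
            j = i + 1
            while j < n and source[j] != ch:
                if source[j] == "\\":
                    j += 1  # skip the escaped character
                j += 1
            line += source.count("\n", i, min(j + 1, n))
            i = j + 1
        else:
            if ch == "\n":
                line += 1
            i += 1
    return comments


SAMPLE = (
    'int x = 1; // erste Zeile \\\n'
    '   zweite Zeile\n'
    'char *s = "// not a comment";\n'
    '/* block\n'
    '   comment */ int y;\n'
    '// last\n'
)
for number, text in extract_comments(SAMPLE):
    print(number, text)
```

Keeping the line number with each comment makes it possible to jump back to the source once a suspicious comment turns up in the extracted list.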
While it is possible to detect a language from a string automatically, you need way more words than fit in a usual comment to do so.
Solution: Use your own eyes and your own brain...

Sublime Text - Exclude comments in search

Every time I search for a function inside of hundreds of files, I see so many matches for comments which have no effect in the code.
Can someone limit Sublime Text's search scope to real code, and exclude comments?
I use Sublime Text 3 for developing a C++ program.
I created a plugin that searches for a given string inside a given scope.
The default scope selector is -comment, which effectively searches outside of comments. The text to search for is taken from the current selection. The results are presented in the drop-down menu.
Basically I combined two API methods:
view.find_all(pattern) that searches for a pattern in the given view.
view.match_selector(position, scope_selector) that checks whether the given position is inside the given scope.
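How the two calls combine can be sketched as a small helper. This is not the plugin's actual source: the function name find_outside_scope is hypothetical, and the helper only assumes that view is a Sublime Text view object whose find_all returns regions with a begin() method.

```python
def find_outside_scope(view, pattern, selector="-comment"):
    """Return the regions matching pattern whose start matches the selector.

    With the default selector "-comment", hits inside comments are dropped.
    """
    return [
        region
        for region in view.find_all(pattern)
        # Check the scope at the first character of each hit.
        if view.match_selector(region.begin(), selector)
    ]
```

Inside a plugin command this would typically be called with self.view, and the resulting regions could then be listed or selected.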
You could use a regex to find only the patterns you want.
Design the regex to match your case.
You can enter a regex by turning on the 'Regular Expression' flag.
Example
For example, this regex finds your search term while skipping single-line comments:
^(?!\/\/)([^\/\n]*)YOUR_SEARCH_TERM
If you also want to skip multi-line comments, use this:
^(?!(\/\/|(\/\*(.|\n)*([^\*])(?=\/))))YOUR_SEARCH_TERM
