I'm writing a quick Manipulate[] program to display a large amount of information, but I'm running into a problem with Unicode. Here is what I currently have as input and output:
Manipulate[
request = filenumber <> "*";
filenames = FileNames[request];
display = Import[type, "List"];
Short[display, 25]
, {filenumber, "001", InputField}, {type, filenames, PopupMenu}]
The problem is that the French-language accents are showing up oddly. The quick workaround I thought of was to change my code to Import[type,"Plaintext"]; which works, but then displays the information in list form, like so:
What would you suggest as a way to get the clarity of the second example with the straightforward list format of the former? So that it wraps on the line rather than having a line break after each entry.
As an aside - probably just as important as the actual question itself - could anybody explain the rationale behind why importing as a "List" distorts the unicode? I've had a lot of trouble working around this, and understanding the underlying behaviour might help me move forward quicker.
Although Import does not have options associated with itself, it takes options relevant to the format being imported. Specifically see the Options section of ref/Format/List for the list of options.
In the case at hand, you can indicate the file encoding with CharacterEncoding->"UTF8":
Import[filename, "List", CharacterEncoding -> "UTF8"]
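On the "why" part of the question: a likely explanation, assuming the files are stored as UTF-8 and the "List" import decodes them with a single-byte encoding by default, is that each accented letter occupies two bytes on disk and each byte is shown as its own character. The Python sketch below only illustrates that effect; the sample text and the choice of Latin-1 as the wrong encoding are assumptions, not something taken from the question.

```python
# Assumption: the file on disk is UTF-8 and the reader decodes it with a
# single-byte encoding such as Latin-1. Both choices are illustrative.
original = "Montréal, Québec"

raw_bytes = original.encode("utf-8")     # the bytes stored in the file
misread = raw_bytes.decode("latin-1")    # decoded with the wrong encoding
correct = raw_bytes.decode("utf-8")      # decoded with the encoding stated explicitly

print(misread)   # each accented letter turns into two odd characters
print(correct)   # the original text
```

Stating the encoding explicitly, as CharacterEncoding -> "UTF8" does in the Import call, corresponds to the last decode above.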
I'm an ABAP programmer and I was asked to make a minor modification to an IPL label.
That was easily done, but now I have been tasked with fixing a long-running error in that same label.
I know nothing about IPL, and the lack of an online viewer makes everything worse...
The problem is a "tabulation" (an unexpected gap) right in the middle of a text (I underlined it in blue on the label's picture).
I checked the code and there's nothing there that should make that tabulation appear.
I spent a whole month reading manuals and trying to fix it, but nothing changes...
Here's the code and the resulting label:
<STX>R<ETX>
<STX><ESC>C<SI>W791<SI>h<ETX>
<STX><ESC>P<ETX>
<STX>F*<ETX>
<STX>H1;f3;o220,52;c34;b0;h2;w1;d3,300052947-FANDANGOS PRESUNTO 140GX14 LD<ETX>
<STX>H2;f3;o130,52;c33;b0;h1;w1;d3,Val:<ETX>
<STX>H3;f3;o130,204;c34;b0;h1;w1;d3,QTD.Unidade:<ETX>
<STX>H4;f3;o90,33;c34;b0;h0;w1;d3,16/08/21<ETX>
<STX>H5;f3;o90,302;c34;b0;h1;w1;d3,14<ETX>
<STX>B6;f3;o375,44;c2,0;w6;h102;r0;d3,17892840816329<ETX>
<STX>H7;f3;o275,44;c26;b0;h17;w17;d3,17892840816329<ETX>
<STX>H8;f3;o130,490;c34;b0;h0;w1;d3,Lote:<ETX>
<STX>B9;f3;o090,600;c2,0;w2;h45;r0;d3,0005218177<ETX>
<STX>H10;f3;o130,600;c34;b0;h0;w1;d3,0005218177<ETX>
<STX>D0<ETX>
<STX>R<ETX>
<STX><SI>l13<ETX>
<STX><ESC>E*,1<CAN><ETX>
<STX><RS>1000<US>1<ETB><ETX>
Label
Can you guys help me, please??
Edit: Just to make it clear, I drew the blue line on the image to show where the problem is.
Here are some tests I did by changing the data:
Test1
Test2
The error always appears at the same point in the label, as long as there's a space in that text.
Have you looked at the raw data of the output? Is it POSSIBLE that what looks like a space is actually some special character that is making IPL choke blue? Because it is literally the 1 character between the "O" and "1". For grins, you might also try to change the character in the data to a "-" just for purposes of confirming data context. It might even just be a TAB character.
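One way to check the raw data for such a hidden character is sketched below in Python. The function name find_suspicious_chars and the sample string with an embedded TAB are hypothetical; the idea is simply to list every character outside the printable ASCII range together with its position.

```python
import unicodedata


def find_suspicious_chars(field: str):
    """Return (index, code point, name) for every character that is not
    printable ASCII (0x20 to 0x7E)."""
    hits = []
    for i, ch in enumerate(field):
        if not (0x20 <= ord(ch) <= 0x7E):
            # Control characters have no Unicode name, so supply a default.
            name = unicodedata.name(ch, "UNNAMED CONTROL")
            hits.append((i, f"U+{ord(ch):04X}", name))
    return hits


# Hypothetical sample: a TAB where a plain space is expected.
sample = "300052947-FANDANGOS PRESUNTO\t140GX14 LD"
for hit in find_suspicious_chars(sample):
    print(hit)
```

An empty result for the real field data would suggest the gap is not caused by a special character in the text itself.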
I did IPL years ago, and actually went as far as defining a label template in advance and generating output that says to use template X (whatever number I created it as), passing along the data that fills the respective fields.
A final option I would throw in is this. Take the sample output you have and just force sample data into each of the output areas. So, instead of your literal data, put fake data in similar context just to see if it is data specific or other. Such as
<STX>H1;f3;o220,52;c34;b0;h2;w1;d3,300052947-FANDANGOS PRESUNTO 140GX14 LD<ETX>
becomes
<STX>H1;f3;o220,52;c34;b0;h2;w1;d3,123456789-TESTING-SAMPLEDATA-123XY12-AB<ETX>
Notice the same context of data, but with no spaces and using a dash "-" just for testing. Is there something special about the actual data? This is an approach that has worked for me historically with similar strangeness when I was first doing IPL labels.
The user decided not to spend any more time on this issue, so I'm now unable to test the label further.
Unfortunately this problem will go unsolved for now. I hope I get another chance to fix this and learn more about IPL.
Thank you so much for your answers!
One thing that constantly annoys me about VS is that when I do a Find or Find all, it looks in comments, strings, and other places. When I'm trying to find a particular bit of code, like and rent, it finds it all over. Is there a way to limit searches just to code?
Not sure if there is a specific setting to ignore comments, but you could do a regex find. For example, assuming you want to find "text", you could use this:
^(?!\s*?//).*?text
Caveats:
Assumes comments start with // as first non-whitespace characters. E.g. C# comment types
Doesn't work for comments at the end of code lines (only comments on their own lines)
Doesn't work with block comments, for example /* comment */
So overall it isn't perfect by any means, but depending on how many hits you are getting, it might help cut them down, which can be useful if you have a lot of false positives in one-line comments.
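The behaviour and the caveats can be checked with a few sample lines. The sketch below uses Python's re module rather than Visual Studio's own regex engine, on the assumption that both treat this particular pattern the same way; the sample lines are made up.

```python
import re

# The pattern from the answer, with "text" as the search term.
pattern = re.compile(r"^(?!\s*?//).*?text")

lines = [
    "int text = 1;",             # code: matches
    "    // text in a comment",  # full-line comment: skipped
    "int x = 0; // text",        # trailing comment: still matches (caveat)
    "/* text */",                # block comment: still matches (caveat)
]

# One boolean per sample line, in the same order as `lines`.
results = [bool(pattern.search(line)) for line in lines]
for line, hit in zip(lines, results):
    print(f"{hit!s:5}  {line}")
```

Only the second line is filtered out, which is the limitation the caveats describe.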
The 'Find All References' function may suit you: it ignores all commented-out code and text in strings. CTRL+K, R is the keyboard shortcut.
(Note that it's designed for going from a specific instance of a search string to all other instances, so if you haven't already found an instance of what you're searching for, you would have to (temporarily) type one into the editor window, then search. Also, it's not available for all languages: I know it works fine for C#, though.)
I have a large code base where most of the documentation and source code comments are in English. But one of the minor contributors wrote comments in a different language, spread across various places.
Is there a simple trick that will let me find them? I imagine first a way to extract all comments from the code and generate a single text file (possibly with source file / line number info), then pipe this through some language detection app.
If that matters, I'm on Linux and the current compiler on this project is Clang.
The only thing that comes to mind is to go through all of the code manually and check it yourself. If it's a similar language, that doesn't contain foreign letters, consider using something with a spellchecker. This way, the text that isn't recognized will get underlined, and easy to spot.
Other than that, I don't see an easy way to go through with this.
You could write a program that reads the files and prints only the comments to another output file, which you then spell check, but this seems like a waste of time, as you would easily be able to spot the comments yourself.
If you do make a program for that, however, keep in mind that there are three things to check for:
If comment starts with /*, make sure it stops reading when encountering */
If comment starts with //, only read one line - unless:
If line starting with // ends with \, read next line as well
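The three rules above can be sketched as a small scanner. This is a minimal Python sketch, not a full C/C++ lexer: the name extract_comments is hypothetical, string and character literals are skipped as an extra precaution that the answer does not mention, and raw strings, digit separators and Windows line endings are not handled.

```python
def extract_comments(source: str):
    """Return a list of (line_number, comment_text) found in C/C++ source.

    Simplified sketch: quoted literals are skipped so that "//" inside a
    string is not mistaken for a comment.
    """
    comments = []
    i, line, n = 0, 1, len(source)
    while i < n:
        ch = source[i]
        nxt = source[i + 1] if i + 1 < n else ""
        if ch == "/" and nxt == "*":
            # Rule 1: block comment, read until the closing */
            end = source.find("*/", i + 2)
            end = n if end == -1 else end
            text = source[i + 2:end]
            comments.append((line, text.strip()))
            line += text.count("\n")
            i = end + 2
        elif ch == "/" and nxt == "/":
            # Rules 2 and 3: line comment, extended by a trailing backslash
            start_line = line
            j = i + 2
            while j < n:
                if source[j] == "\n":
                    if source[j - 1] == "\\":
                        line += 1      # continuation: keep reading
                        j += 1
                        continue
                    break
                j += 1
            comments.append((start_line, source[i + 2:j].strip()))
            i = j                      # the newline is counted below
        elif ch in "\"'":
            # Skip a string or character literal, honouring escapes
            quote = ch
            i += 1
            while i < n and source[i] != quote:
                if source[i] == "\\" and i + 1 < n:
                    i += 1             # skip the escaped character too
                if source[i] == "\n":
                    line += 1
                i += 1
            i += 1                     # step over the closing quote
        else:
            if ch == "\n":
                line += 1
            i += 1
    return comments
```

The resulting (line number, text) pairs could be written out per file and passed to a spellchecker or a language detector, as the question proposes.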
While it is possible to detect a language from a string automatically, you need way more words than fit in a usual comment to do so.
Solution: Use your own eyes and your own brain...
I am getting data from a broken RSS feed that gives me the wrong link. I wanted to fix this link, so I wrote this regex:
<link.*>(.*)&.*tid(.*)</link>
and the link could be like:
www.somedomain.com/?value=50&burrrdurrrr;tid=120
But the real working link is in this form:
www.somedomain.com/?value=50&tid=120
My question is this: if my measure looks like this:
[FeedURL]
Measure=Plugin
Plugin=Plugins\WebParser.dll
Url=[Feed]
StringIndex=2 ;now I only get www.somedomain.com/?value=50
Substitute=#SubstituteFeed#
How am I supposed to concatenate the strings together to complete the url?
I'm guessing that rather than &burrrdurrrr;, the link has &amp;, which is how you have to write & in an HTML or XML file.
If that's the case, you just need to set the DecodeCharacterReference option, as described in this handy-looking tutorial. Another option mentioned there is Substitute, which would be able to strip it out even if it really was &burrrdurrrr;.
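Both ideas can be illustrated outside Rainmeter. The Python sketch below is not Rainmeter configuration: the helper names decode_entities and strip_bogus_entity are hypothetical, and the two URLs are the ones from the question (the first with the assumed &amp; form).

```python
import html
import re


def decode_entities(url: str) -> str:
    """Decode character references, which is the kind of work the
    DecodeCharacterReference option is described as doing."""
    return html.unescape(url)


def strip_bogus_entity(url: str) -> str:
    """Substitution-style fallback: replace '&...;' with a plain '&'."""
    return re.sub(r"&[^&;=]*;", "&", url)


# Assumption: the feed contains the XML-escaped form of "&".
print(decode_entities("www.somedomain.com/?value=50&amp;tid=120"))

# Fallback if the text between "&" and ";" is not a real entity.
print(strip_bogus_entity("www.somedomain.com/?value=50&burrrdurrrr;tid=120"))
```

Either way the result is a single string, so no concatenation of separate captures is needed.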
None of this is a particularly sensible way of dealing with HTML or XML - a much better approach would be a plugin which actually parsed the document structure and let you reference nodes using XPath or CSS rules - but you work with what you've got, I guess. (I've never heard of this "Rainmeter" before, despite its claim to be "the best known and most popular desktop customization program for Windows"; maybe because nobody else calls their program that, instead almost universally using the word "widget"?)
So I have been trying to figure out how to add syntax highlighting for the name of typedef's in c++ files, in sublime text.
For example, if I have typedef long long integer; I want integer to be highlighted (preferably in the same color as the other types: int, bool, etc.). I looked at the C.tmLanguage file and tried to add the regex ^typedef.*?\s(\w+)\s*; to storage.type.c (line 49), but it didn't work. If I add the word string, it will highlight all instances of the word string. I also tried adding the regex to storage.type.c++ in the C++.tmLanguage file, but that did not work either.
Does anybody know how to get typedef's highlighted in sublime text?
Also, is there a way to get syntax highlighting for class name? Let's say I declare a string or vector, I would like either string or vector to be highlighted.
That regex would work (I believe) if you had something along the lines of typedef foo; To get the behavior you want, you will have to create a slightly more complex pattern entry in the tmLanguage file. As the language file is based on TextMate's, you will want to have this as a reference (http://manual.macromates.com/en/language_grammars#language_grammars). I would also recommend using PlistJsonConverter (working in JSON is easier for me than working in XML). You will probably need to define begin and end patterns (begin will probably be typedef, and end will probably be ;). You can then apply whatever patterns you want to that group.
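As a sanity check outside Sublime Text, the regex from the question can be tried in Python's re module, on the assumption that it treats this pattern the same way as the engine Sublime Text uses. The helper name alias_name is hypothetical. The sketch only shows what the capture group picks up; in a TextMate-style grammar, a rule's name applies to the whole match unless a scope is assigned to the capture group, which may be part of why the attempt did not highlight just the alias.

```python
import re

# The regex from the question; group 1 is meant to be the new type name.
typedef_alias = re.compile(r"^typedef.*?\s(\w+)\s*;")


def alias_name(line: str):
    """Return the identifier captured by group 1, or None if no match."""
    m = typedef_alias.search(line)
    return m.group(1) if m else None


print(alias_name("typedef long long integer;"))
print(alias_name("typedef foo;"))
print(alias_name("int x;"))
```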
As for the class name highlighting, I would look to see what, if any scopes are being applied. If none are, you will have to come up with a regex to apply the scope to those. You can then add a color entry, or use a defined one from the color scheme.
Edit:
Actually they don't appear to be JSON. I see () rather than []. JSON is pretty simple to understand. You can look for something more in depth, but wikipedia is a good place to start. What you would probably be interested in are the things under the "Rule Keys" section. I did some searching (because I knew there were some better examples out there), and came across http://docs.sublimetext.info/en/latest/extensibility/syntaxdefs.html . It goes over syntax definitions from scratch, but the most relevant section is probably http://docs.sublimetext.info/en/latest/extensibility/syntaxdefs.html#analyzing-patterns. I don't have a regex to find class names, so you would have to come up with one yourself. If you haven't already though, you may want to search around to see if someone else has implemented a language file in a way that works for you.
You will want to start with the built-in tmLanguage file and convert it from a Plist to JSON. You can then edit that file and convert it back.