Comments in Stata and Mata: Do file editor vs command prompt - comments

I typically work in another text editor and simply copy and paste my work into Stata's command prompt. However, I have noticed a difference between the way the command prompt and the do file editor handle comments.
The code below reproduces the things I have discovered:
mata
//test comment
/* test comment 2 */
end
//test comment 3
*test comment 4
/* test comment 5*/
When run from the do file editor, the code runs without issue.
But when I run it after copying and pasting into the command prompt, I receive a number of r(3000) errors in mata and r(199) errors in Stata.
The sole exception is that the * comments in regular Stata work fine in both interfaces.
I also see that the // comment in mata gives an "expression invalid" error message along with the r(3000) notification, but I only receive the r(3000) message when I use the /* text */ comment. In regular Stata, both comment types that are not * give "/ is not a valid command name" messages along with the r(199).
My main question is:
What is the reason behind this difference? Is there anything I can do to suppress these errors?
Also, this is something like a red flag for me:
Are there other behaviors that differ when I run things via the command prompt rather than the do file editor?

The following technical note on do-files from the Stata 16 manual explains:
"...The /* */, //, and /// comment indicators can be used in do-files and ado-files only; you may not use them interactively. You can, however, use the ‘*’ comment indicator interactively..."
So there is nothing surprising here. You can easily prevent errors like these by following the conventions. Just read the relevant section of the aforementioned manual for more details.
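If you still want to paste from an external editor, one workaround is to strip the do-file-only comment forms before pasting. The following is only a minimal sketch in Python, not anything provided by Stata. The function name strip_stata_comments is made up for illustration. It assumes /* */ blocks are not nested and that no comment marker appears inside a string literal.

```python
import re

def strip_stata_comments(code: str) -> str:
    """Drop do-file-only comments so a snippet can be pasted interactively."""
    # Remove /* ... */ blocks (assumption: they are not nested).
    code = re.sub(r"/\*.*?\*/", "", code, flags=re.DOTALL)
    # Join lines continued with ///, which is also do-file-only.
    code = re.sub(r"(^|[ \t])///[^\n]*\n", " ", code, flags=re.MULTILINE)
    # Remove // comments at the start of a line or after a blank.
    code = re.sub(r"(^|[ \t]+)//[^\n]*", "", code, flags=re.MULTILINE)
    # Drop lines that are now empty; * comments are left untouched.
    return "\n".join(line.rstrip() for line in code.splitlines() if line.strip())
```

Applied to the snippet in the question, this leaves only mata, end, and the *test comment 4 line, which the command prompt accepts.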
Only StataCorp knows for sure, but such differences probably arise from how Stata interprets the code internally when it is parsed from a do-file rather than from the command prompt.
See the following post for another (unrelated) example of an inconsistent behaviour:
Stata axis labels off-center when broken over multiple lines
Personally, after using Stata extensively for years, I have not noticed any other major differences between running code from do-files and running it interactively.

Related

wrap multi-line comments in vs code

VS Code seems to interpret wrapping multi-line comments differently from other IDEs. For instance, if I set IntelliJ to wrap multi-line comments at column 100, it breaks the line for me. But if I ask VS Code to do the same, it only wraps the line visually, and if I later open the same file in a simple text editor, I get one long line.
How do I get VS Code to auto break long comment lines?
Intellij:
/**
* Intellij will break this line
* at the correct location
* because I asked it to break
*/
VS Code
/**
* VS Code will wrap this line
at the correct location
but it's really one very
long line.
*/
I don't think VS Code provides native support for this.
There is an extension in the marketplace for this though: https://marketplace.visualstudio.com/items?itemName=stkb.rewrap.
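To make the distinction concrete: a hard wrap inserts real line breaks into the file, while a soft wrap only changes how the editor displays it. Here is a rough Python sketch of a hard wrap for a block comment. It is not how the extension works internally, and the function name hard_wrap_block_comment is invented for this example.

```python
import textwrap

def hard_wrap_block_comment(text: str, width: int = 100) -> str:
    """Re-flow comment text into a /** ... */ block with real line breaks."""
    # Collapse existing whitespace so the text is re-flowed from scratch.
    flat = " ".join(text.split())
    # width includes the " * " prefix added to every line.
    body = textwrap.wrap(flat, width=width,
                         initial_indent=" * ", subsequent_indent=" * ")
    return "\n".join(["/**", *body, " */"])

print(hard_wrap_block_comment(
    "VS Code will wrap this line at the correct location "
    "but it's really one very long line.", width=40))
```

Every line of the result ends in an actual newline, so it looks the same in any text editor.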

SPSS: Create links/anchors in syntax

Is there a way to create links or anchors within SPSS syntax? Something like linking to a bookmark.
I am making changes and additions to a syntax file, and document these changes at the bottom of the file as comments. In these comments I would like to link to the part of the syntax that was changed. Now I just write the line number, but that changes as I add more syntax, so the reference becomes incorrect.
Bookmarks were the closest thing I found to what I want to do, but I can't turn them into a link. Moreover, I can only create a maximum of 9 bookmarks, which is not enough.
Trying to think creatively here:
instead of bookmarking all the changes, you could break up your syntax into many small syntaxes - each of which contains one of the parts where a change was made.
you can name and number the small syntaxes accordingly.
Then you create one syntax which contains a series of INSERT commands, which calls each of the small syntaxes in turn. You can add titles and remarks between the insert commands, so other users can follow the process and study the relevant small syntax that they need separately.
The Statistics Syntax Editor supports bookmarks - you can have up to 10. Generate a few in the SE and save the syntax file to see how these are represented (hint: look at the COMMENT BOOKMARK lines).

How to find foreign language used in "C comments"

I have a large code base where most of the documentation and source code comments are in English. But one of the minor contributors wrote comments in a different language, spread across various places.
Is there a simple trick that will let me find them? I imagine first a way to extract all comments from the code and generate a single text file (with possible source file / line number info), then pipe this through some language detection app.
If that matters, I'm on Linux and the current compiler on this project is Clang.
The only thing that comes to mind is to go through all of the code manually and check it yourself. If it's a similar language, that doesn't contain foreign letters, consider using something with a spellchecker. This way, the text that isn't recognized will get underlined, and easy to spot.
Other than that, I don't see an easy way to go through with this.
You could make a program, that reads the files and only prints the comments out to another output file, where you then spell check that file, but this would seem to be a waste of time, as you would easily be able to spot the comments yourself.
If you do make a program for that, however, keep in mind that there are three things to check for:
If comment starts with /*, make sure it stops reading when encountering */
If comment starts with //, only read one line - unless:
If line starting with // ends with \, read next line as well
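The three rules above can be sketched as a small scanner. This is a minimal Python example rather than a full C lexer: the function name extract_c_comments is made up here, it additionally skips string and character literals so that a // inside a string is not reported, and it ignores trigraphs and raw strings.

```python
def extract_c_comments(source: str):
    """Return a list of (line_number, comment_text) found in C source text."""
    comments = []
    i, line, n = 0, 1, len(source)
    while i < n:
        ch = source[i]
        nxt = source[i + 1] if i + 1 < n else ""
        if ch == "\n":
            line += 1
            i += 1
        elif ch in "\"'":
            # Skip a string or character literal, honouring backslash escapes.
            quote = ch
            i += 1
            while i < n and source[i] != quote:
                if source[i] == "\\":
                    i += 1  # move onto the escaped character
                if i < n and source[i] == "\n":
                    line += 1
                i += 1
            i += 1
        elif ch == "/" and nxt == "*":
            # Block comment: read until the closing */ (or end of input).
            end = source.find("*/", i + 2)
            if end == -1:
                end = n
            text = source[i + 2:end]
            comments.append((line, text.strip()))
            line += text.count("\n")
            i = end + 2
        elif ch == "/" and nxt == "/":
            # Line comment: read to end of line, continuing past a trailing \.
            start, j = line, i + 2
            while j < n and source[j] != "\n":
                if source[j] == "\\" and j + 1 < n and source[j + 1] == "\n":
                    line += 1
                    j += 1  # step onto the newline, skipped just below
                j += 1
            comments.append((start, source[i + 2:j].strip()))
            i = j
        else:
            i += 1
    return comments
```

The returned list could then be written out with file names and line numbers and passed to a spell checker or a language detector.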
While it is possible to detect a language from a string automatically, you need way more words than fit in a usual comment to do so.
Solution: Use your own eyes and your own brain...

Rstudio difference between run and source

I am using Rstudio and not sure how options "run" and "source" are different.
I tried googling these terms but 'source' is a very common word and wasn't able to get good search results :(
Run and source have subtly different meanings. According to the RStudio documentation,
The difference between running lines from a selection and invoking
Source is that when running a selection all lines are inserted
directly into the console whereas for Source the file is saved to a
temporary location and then sourced into the console from there
(thereby creating less clutter in the console).
Something to be aware of is that sourcing a file that defines functions makes those functions available for scripts to use. What does this mean? Imagine you are trying to troubleshoot a function that is called from a script. After editing the function, you need to source the file containing it again so that the changes take effect when the calling line in the script is next run.
A further aspect of this is that you can source functions from your scripts. I use this code to automatically source all of the functions in a directory, which makes it easy to run a long script with a single run:
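# source our functions
# use forward slashes: in an R string, "\t" would be read as a tab
code.dir <- "c:/temp"
# match only file names ending in .r or .R
code.files <- dir(code.dir, pattern = "\\.[rR]$")
for (file in code.files) {
  source(file = file.path(code.dir, file))
}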
Sometimes, for reasons I don't understand, you will get different behavior depending on whether you select all the lines of code and press the Run button or go to the Code menu and choose 'Source.' For example, in one specific case, writing a ggplot to a png file worked when I selected all my lines of code, but the write failed when I went to the Code menu and chose 'Source.' However, if I choose 'Source with Echo,' I'm able to print to a png file again.
I'm simply reporting a difference I've seen between selecting and running all your lines of code and going to the Code menu and choosing 'Source,' at least in the case of trying to print a ggplot to a png file.
An important implication of @AndyClifton's answer is:
RStudio breakpoints work in source (Ctrl-Shift-S) but not in run (Ctrl-Enter)
Presumably the reason is that with run, the code is getting passed straight into the console with no support for a partial submission.
You can still use browser() with run, though.
print() to console is supported in debugSource (Ctrl-Shift-S) as well as run.
The "run" button simply executes the selected line or lines. The "source" button will execute the entire active document. But why not just try them and see the difference?
I also just discovered that the encoding used to read a function can differ depending on whether you source the file or add the function from the source file to your environment with Ctrl+Enter!
In my case there was a regex with a special character (µ) in my function. When I imported the function directly (Ctrl+Enter) everything would work, while I had an error when sourcing the file containing this function.
To solve this issue I specified the encoding of the sourced file in the source function (source("utils.R", encoding = "UTF-8")).
Run will run each line of code, which means that it hits enter at the beginning of each line, which prints the output to the console. Source won't print anything unless you source with echo, which means that ggplot won't print to pngs, as another poster mentioned.
A big practical difference between run and source is that if you get an unaccounted for error in source it'll break you out of the code without finishing, whereas run will just pass the next line to the console and keep going. This has been the main practical difference I've seen working on cleaning up other people's scripts.
When using RStudio you can press the run button in the script section - it will run the selected line.
Next to it you have the re-run button, to run the line again, and the source button next to it will run entire chunks of code.
I found a video about this topic:
http://www.youtube.com/watch?v=5YmcEYTSN7k
Source/Source with echo is used to execute the whole file whereas Run as far as my personal experience goes executes the line in which your cursor is present.
Thus, Run helps you to debug your code. Watch out for the environment. It will display what's happening in the stack.
To those saying plots do not show. They won't show in Plots console. But you can definitely save the plot to disc using Source in RStudio. Using this snippet:
png(filename)
print(p)
dev.off()
I can confirm plots are written to disc. Furthermore, print statements are also output to the console.

Do standard windows .ini files allow comments?

Are comments allowed in Windows ini files? (...assuming you're using the GetPrivateProfileString api functions to read them...)
[Section]
Name=Value ; comment
; full line comment
And, is there a proper spec of the .INI file format anywhere?
Thanks for the replies - However maybe I wasn't clear enough. It's only the format as read by Windows API Calls that I'm interested in. I know other implementations allow comments, but it's specifically the MS Windows spec and implementation that I need to know about.
Windows INI API support for:
Line comments: yes, using semi-colon ;
Trailing comments: No
The authoritative source is the Windows API function that reads values out of INI files
GetPrivateProfileString
Retrieves a string from the specified section in an initialization file.
The reason "full line comments" work is because the requested value does not exist. For example, when parsing the following ini file contents:
[Application]
UseLiveData=1
;coke=zero
pepsi=diet ;gag
#stackoverflow=splotchy
Reading the values:
UseLiveData: 1
coke: not present
;coke: not present
pepsi: diet ;gag
stackoverflow: not present
#stackoverflow: splotchy
Update: I used to think that the number sign (#) was a pseudo line-comment character. The reason using leading # works to hide stackoverflow is because the name stackoverflow no longer exists. And it turns out that semi-colon (;) is a line-comment.
But there is no support for trailing comments.
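To make the rule explicit, here is a small Python model of the lookups listed above. It is only an illustration of the results reported in this answer, not the Windows implementation. The function name read_ini_as_described is invented, and details such as case-insensitive key matching are left out.

```python
def read_ini_as_described(text: str) -> dict:
    """Model the lookups above: ';' lines are comments, trailing text is kept."""
    values = {}
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(";"):
            continue  # blank line or full-line comment
        if line.startswith("[") and "]" in line:
            section = line[1:line.index("]")]
            continue
        if "=" in line and section is not None:
            key, value = line.split("=", 1)
            # No trailing-comment handling: everything after '=' is the value.
            values[(section, key.strip())] = value.strip()
    return values

sample = """[Application]
UseLiveData=1
;coke=zero
pepsi=diet ;gag
#stackoverflow=splotchy
"""
for key, value in read_ini_as_described(sample).items():
    print(key, value)
```

In this model pepsi comes back as diet ;gag, coke is absent, and the key that exists is #stackoverflow rather than stackoverflow, matching the list above.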
I have seen comments in INI files, so yes. Please refer to this Wikipedia article. I could not find an official specification, but that is the correct syntax for comments, as many game INI files had this as I remember.
Edit
The API returns the Value and the Comment (forgot to mention this in my reply), just construct an example INI file and call the API on this (with comments) and you can see how this is returned.
USE A SEMI-COLON AT BEGINNING OF LINE --->> ; <<---
Ex.
; last modified 1 April 2001 by John Doe
[owner]
name=John Doe
organization=Acme Widgets Inc.
I like the analysis of @Ian Boyd, because it is based on the official GetPrivateProfileString() method of Microsoft.
In my attempts at writing a Microsoft-compatible INI parser, I took a closer look at the said Microsoft API and, for comments, I found out:
you can have line comments using semicolon
the semicolon needn't be the first character of the line; it can be preceded by space, tab or vertical tab
you can have trailing "comments" after a section even without semicolon. It's probably not intended to be a comment, but the parser will ignore it.
values outside a section cannot be accessed (at least I did not find a way), effectively making them useless except for commenting purposes
certainly abuse, but the parser overflows at 65536 characters, so anything after that will not be part of the value either. I would not rely on this, since Microsoft could fix this in later versions of Windows. Also, it's not very useful as a comment when you don't see it.
Example:
this=cannot be accessed
[section]this=is ignored
;this=is a line comment
    ;this=is a comment preceded by spaces
key=value <... 65530 spaces ...>this=cannot be parsed
Yes, comments are allowed.
The way to comment is to put ; at the start of a new line, rather than right after the content on the same line, which is allowed in other file types.
Let me show you an example:
I use an .ini file to pass some parameters to my training file when I use the SUMO software. If I write this:
width_layers = 400 ;the number of neurons per layer in the neural network.
I will get an error message which is
ValueError: invalid literal for int() with base 10: '400 ;the number of neurons per layer in the neural network.'
I have to create a line for that, which is
width_layers = 400
;the number of neurons per layer in the neural network.
Then it will work. Hope this helps!
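The int() in that error message suggests the file is being read by Python rather than by the Windows API. Assuming the reader is Python's standard configparser module (a guess, the post does not say), inline comments are off by default and can be switched on with inline_comment_prefixes. A minimal sketch, where the section name training is hypothetical:

```python
import configparser

INI_TEXT = """\
[training]
width_layers = 400 ;the number of neurons per layer in the neural network.
"""

# Default settings: the trailing text stays part of the value,
# so converting it with int() would raise ValueError.
default_parser = configparser.ConfigParser()
default_parser.read_string(INI_TEXT)
raw_value = default_parser["training"]["width_layers"]
print(repr(raw_value))

# With inline comments enabled, text after " ;" is dropped.
inline_parser = configparser.ConfigParser(inline_comment_prefixes=(";",))
inline_parser.read_string(INI_TEXT)
width_layers = inline_parser.getint("training", "width_layers")
print(width_layers)
```

This only affects how Python reads the file; it says nothing about how GetPrivateProfileString treats the same line.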
