I have read a lot about how painful it is to replicate Stata's easy robust option in R in order to get robust standard errors. I tried the approaches from StackExchange and the Economic Theory Blog. They work, but my problem arises when I want to print the results with the stargazer function (which produces the .tex code for LaTeX files).
Here is an illustration of my problem:
reg1 <-lm(rev~id + source + listed + country , data=data2_rev)
stargazer(reg1)
This prints the R output as .tex code (with non-robust SEs). If I want robust SEs, I can compute them with the sandwich package as follows:
vcov <- vcovHC(reg1, "HC1")
If I now call stargazer(vcov), only the output of the vcovHC function is printed and not the regression output itself.
With the lmtest package it is possible to print at least the coefficient estimates, but not the number of observations, R2, adjusted R2, residual standard error and the F-statistic.
lmtest::coeftest(reg1, vcov. = sandwich::vcovHC(reg1, type = 'HC1'))
This gives the following output:
t test of coefficients:

            Estimate Std. Error t value Pr(>|t|)
(Intercept) -2.54923    6.85521 -0.3719 0.710611
id           0.39634    0.12376  3.2026 0.001722 **
source       1.48164    4.20183  0.3526 0.724960
country     -4.00398    4.00256 -1.0004 0.319041
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
How can I add or get an output with the following parameters as well?
Residual standard error: 17.43 on 127 degrees of freedom
Multiple R-squared: 0.09676, Adjusted R-squared: 0.07543
F-statistic: 4.535 on 3 and 127 DF, p-value: 0.00469
Did anybody face the same problem and can help me out?
How can I use robust standard errors in the lm function and apply the stargazer function?
You already calculated robust standard errors, and there's an easy way to include them in the stargazer output:
library("sandwich")
library("plm")
library("stargazer")
data("Produc", package = "plm")
# Regression
model <- plm(log(gsp) ~ log(pcap) + log(pc) + log(emp) + unemp,
             data = Produc,
             index = c("state", "year"),
             model = "pooling")  # plm() selects the estimator via `model`, not `method`
# Adjust standard errors
cov1 <- vcovHC(model, type = "HC1")
robust_se <- sqrt(diag(cov1))
# Stargazer output (with and without RSE)
stargazer(model, model, type = "text",
se = list(NULL, robust_se))
Solution found here: https://www.jakeruss.com/cheatsheets/stargazer/#robust-standard-errors-replicating-statas-robust-option
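For intuition about what vcovHC(..., type = "HC1") followed by sqrt(diag(...)) produces, here is a minimal Python sketch for the special case of a single regressor plus intercept. It is only an illustration and no replacement for the sandwich package; the function name ols_with_hc1 and the toy data are my own assumptions.

```python
import math


def ols_with_hc1(x, y):
    """Fit y = a + b*x by OLS; return (b, classical SE of b, HC1-robust SE of b)."""
    n = len(x)
    k = 2  # estimated parameters: intercept and slope
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]

    # Classical SE: assumes one common error variance for all observations.
    sigma2 = sum(e ** 2 for e in resid) / (n - k)
    se_classical = math.sqrt(sigma2 / sxx)

    # HC0 "sandwich" variance of the slope: each squared residual is weighted
    # by its own squared deviation of x. HC1 rescales HC0 by n / (n - k).
    hc0 = sum((xi - x_bar) ** 2 * e ** 2 for xi, e in zip(x, resid)) / sxx ** 2
    se_hc1 = math.sqrt(hc0 * n / (n - k))
    return b, se_classical, se_hc1


# Invented toy data, purely for illustration.
slope, se_plain, se_robust = ols_with_hc1([-1, 0, 1], [0, 1, 0])
print(slope, se_plain, se_robust)
```

The coefficient is the same in both cases; only the standard error changes, which is why passing the robust SEs through stargazer's se argument is enough.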
Update: I'm not very familiar with F-tests. People are discussing those issues, e.g. https://stats.stackexchange.com/questions/93787/f-test-formula-under-robust-standard-error
According to http://www3.grips.ac.jp/~yamanota/Lecture_Note_9_Heteroskedasticity:
"A heteroskedasticity-robust t statistic can be obtained by dividing an OLS estimator by its robust standard error (for zero null hypotheses). The usual F-statistic, however, is invalid. Instead, we need to use the heteroskedasticity-robust Wald statistic."
So perhaps use a Wald statistic here?
This is a fairly simple solution using coeftest:
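library("lmtest")    # coeftest()
library("sandwich")  # vcovCL()
library("stargazer")
reg1 <- lm(rev ~ id + source + listed + country, data = data2_rev)
# Cluster-robust (HC1) coefficient table, clustered by country
cl_robust <- coeftest(reg1, vcov = vcovCL, type = "HC1", cluster = ~country)
se_robust <- cl_robust[, 2]  # second column holds the standard errors
stargazer(reg1, reg1, cl_robust, se = list(NULL, se_robust, NULL))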
Note that I only included cl_robust in the output as a verification that the results are identical.
I downloaded the latest Stanford POS Tagger from https://nlp.stanford.edu/software/tagger.html and then used the following code to get the POS tags of a sentence:
import nltk
from nltk.tag import StanfordPOSTagger

jar = './stanford_postagger/stanford-postagger-3.9.1.jar'
model = './stanford_postagger/models/english-left3words-distsim.tagger'
pos_tagger = StanfordPOSTagger(model, jar)
text = nltk.word_tokenize('How much did the Dow rise?')
stanford_pos = pos_tagger.tag(text)
The output for this sentence is:
[('How', 'WRB'), ('much', 'RB'), ('did', 'VBD'), ('the', 'DT'), ('Dow', 'NNP'), ('rise', 'NN'), ('?', '.')]
which is wrong: the final verb "rise" is tagged as a noun (NN).
But the online parser at http://nlp.stanford.edu:8080/parser/index.jsp gives the right POS tags:
How/WRB
much/JJ
did/VBD
the/DT
dow/NN
rise/VB
?/.
Could someone tell me why these two give different results?
I am working on a protocol using TeXMaker. I switched from Eclipse+Texlipse to TeXMaker, and what compiled successfully before does not compile anymore.
I have a main.tex file, which contains the structure of my protocol. I have several tex-files as inputs and a design.sty, which provides my design. I want to compile and create the PDF-protocol.
When I try to execute the following code in TeXMaker (the main.tex):
\documentclass[11pt,a4paper,oneside,listof=totoc,bibliography=totoc,
version=first]{scrreprt}
\usepackage{design}
\begin{document}
\pagenumbering{arabic}
% cover
\input{./cover.tex}
% introduction
\newpage
\chapter{Introduction}
\section{Synmikro}
\input{./synmicro.tex}
\section{Genetic Switches}
\input{./switches.tex}
\section{ECFs}
\input{./ECF.tex}
\section{Sinorhizobium Meliloti}
\input{./meliloti.tex}
\newpage
\section{Laboratory Internship}
\input{./internship.tex}
\section{Bioinformatics}
\input{./bioinfo.tex}
% materials and methods
\newpage
\chapter{Material and Methods}
\section{Used strains}
\input{./MMexoECFs.tex}
\section{Cultivation conditions}
\input{./MMcultivation.tex}
\section{RNA preparation}
\input{./MMrNAprep.tex}
\section{Quality control of total RNA}
\input{./MMtotalRNAQC.tex}
%\section{Quality control}
% \subsection{PCR and Agarose gel}
% \input{./MMnormPCR.tex}
%\subsection{RNA purity and integrity control}
% \input{./MMbioanalyzer.tex}
\section{qRT-PCR}
\input{./MMqRTPCR.tex}
%\section{QBit}
% \input{./MMqbit.tex}
\section{Bioinformatics}
\input{./MMbioinfo.tex}
\subsection{Non-restrictive approach}
\input{./MMnonRestrictive.tex}
\subsection{Levenshtein distance}
\input{./MMlev2.tex}
\subsection{Feature Search}
\input{./MMfeatureSearch.tex}
%\subsection{Position Specific Scoring Matrices}
% \input{./MMpssm.tex}
% results
\newpage
\chapter{Results}
%\section{Agarose Gel}
% \input{./Ragarose.tex}
\section{Nanodrop and Bioanalyzer}
\input{./Rbioanalyzer.tex}
%\section{Qbit}
% \input{./Rqubit.tex}
\newpage
\section{Real-Time PCR}
\input{./RRTpCR.tex}
\newpage
\section{Bioinformatics}
\input{./Rbioinfo.tex}
% discussion
\end{document}
TeXMaker gives me several errors for main.tex. They are:
Line 47: File ended while scanning use of \caption#xdblarg
Line 79: !Latex Error: \begin{figure} on input line 7 ended by \end{document}
Line 79: !You can't use '\end' in internal vertical mode
Line 79: !Latex Error: \begin{figure} on input line 7 ended by \end{document}
! Missing } inserted
Line 1: ! Emergency stop. <*> main.tex ***(job aborted, no legal \end found)
Line 47 is "\input{./MMqRTPCR.tex}"
Line 79 is "\end{document}"
I am honestly confused. TeXMaker's error description points to missing braces. Am I going blind? I checked the braces three times and cannot figure out what I missed, so I am guessing that I missed something crucial about LaTeX.
Thanks for any help!
UPDATE:
In the input file "MMqRTPCR.tex", I commented out a figure and all errors are gone. Here is the content of the file:
TEXTEXTEXT
%\begin{figure}[H]
%\centering
% \includegraphics[scale=0.6,natwidth=764,natheight=218]%{deltaDeltaCorrectedFormula.png}
% \caption{mycaption}
%\label{deltaDelta}
%\end{figure}
TEXTEXTEXT
\begin{table}[!h]
\centering
\caption{Primer sequences and targets}
\label{table1}
\begin{tabular}{|l|l|l|ll}
\cline{1-3}
Primer Sequence & Target & Direction \\ \cline{1-3}
AACATGTGCCGGTTGATAG & ECF20_992 & forward \\ \cline{1-3}
GCTGCTTCGGTATTGCTCA & ECF20_992 & reverse \\ \cline{1-3}
TCGTACCATTGAAAGCCTG & ECF02_2817 & forward \\ \cline{1-3}
ATCAATGGCTTCACGTGCA & ECF02_2817 & reverse \\ \cline{1-3}
TTCAAGAAACCATGGCCAC & ECF11_987 & forward \\ \cline{1-3}
GCTCGGCCAAATATCATCG & ECF11_987 & reverse \\ \cline{1-3}
\end{tabular}
\end{table}
TEXTEXTEXT
** UPDATE & SOLUTION**
The error was not in main.tex itself, but in an input file. So, when TeXMaker tells you about missing braces, jump to the line in main.tex where the error occurred and check the braces in the input file that is included there.
So, the error was a missing curly brace after all! It was not missing in main.tex, though, but in the input file, within the figure caption.
Thanks to anybody whose brain might have melted trying to find a solution to my problem. I hope this helps others who encounter the same error. :)
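To narrow down which input file has the unbalanced brace, a rough check like the following Python sketch can help. It is a heuristic under simple assumptions (it ignores comments and escaped braces, but knows nothing about verbatim environments or a line break directly followed by a comment); the function name brace_balance and the script name used below are hypothetical.

```python
import re
import sys


def brace_balance(tex_source: str) -> int:
    """Return (number of '{') minus (number of '}'), skipping comments and \\{ \\}."""
    balance = 0
    for line in tex_source.splitlines():
        # Drop everything after an unescaped % (a LaTeX comment).
        line = re.sub(r"(?<!\\)%.*", "", line)
        # Remove escaped braces so they are not counted.
        line = re.sub(r"\\[{}]", "", line)
        balance += line.count("{") - line.count("}")
    return balance


if __name__ == "__main__":
    # Report every file given on the command line whose braces do not balance.
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8") as handle:
            balance = brace_balance(handle.read())
        if balance != 0:
            print(f"{path}: brace balance is {balance:+d}")
```

Saved as, say, check_braces.py, it could be run with the input files as arguments, e.g. python check_braces.py MMqRTPCR.tex MMbioinfo.tex. A positive balance means a closing brace is missing.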
I am using the dprint package with knitr, mainly so that I can highlight rows in a table. I have got that working, but the output image reserves a fairly large area for a footnote, which takes up unnecessary space.
Is there a way to get rid of it?
Also since I am fairly new to dprint, if anybody has better ideas/suggestions as to how to highlight tables and make them look pretty without any footnotes... or ways to tidy up my code that would be great!
An example of the Rmd file code is below...
```{r fig.height=10, fig.width=10, dev='jpeg'}
library("dprint")
k <- data.frame(matrix(1:100, 10,10))
CBs <- style(frmt.bdy=frmt(fontfamily="HersheySans"), frmt.tbl=frmt(bty="o", lwd=1),
frmt.col=frmt(fontfamily="HersheySans", bg="khaki", fontface="bold", lwd=2, bty="_"),
frmt.grp=frmt(fontfamily="HersheySans",bg="khaki", fontface="bold"),
frmt.main=frmt(fontfamily="HersheySans", fontface="bold", fontsize=12),
frmt.ftn=frmt(fontfamily="HersheySans"),
justify="right", tbl.buf=0)
x <- dprint(~., data=k,footnote=NA, pg.dim=c(10,10), margins=c(0.2,0.2,0.2,0.2),
style=CBs, row.hl=row.hl(which(k[,1]==5), col='red'),
fit.width=TRUE, fit.height=TRUE,
showmargins=TRUE, newpage=TRUE, main="TABLE TITLE")
```
Thanks in advance!
I haven't used dprint before, but I see a couple of different things that might be causing problems:
The start of your code chunk has defined the image width and height, which dprint seems to be trying to use.
You are setting both fit.height and fit.width. I think only one of those is used (in other words, the resulting image isn't stretched to fit both height and width, but only the one that seems to make most sense, in this case, width).
After tinkering around for a minute, here's what I did that minimizes the footnote. However, I don't know if there is a more efficient way to do this.
```{r dev='jpeg'}
library("dprint")
k <- data.frame(matrix(1:100, 10,10))
CBs <- style(frmt.bdy=frmt(fontfamily="HersheySans"),
frmt.tbl=frmt(bty="o", lwd=1),
frmt.col=frmt(fontfamily="HersheySans", bg="khaki",
fontface="bold", lwd=2, bty="_"),
frmt.grp=frmt(fontfamily="HersheySans",bg="khaki",
fontface="bold"),
frmt.main=frmt(fontfamily="HersheySans", fontface="bold",
fontsize=12),
frmt.ftn=frmt(fontfamily="HersheySans"),
justify="right", tbl.buf=0)
x <- dprint(~., data=k, style=CBs, pg.dim = c(7, 4.5),
showmargins=TRUE, newpage=TRUE,
main="TABLE TITLE", fit.width=TRUE)
```
Update
Playing around to determine the sizes of the images is a total drag. But, if you run the code in R and look at the structure of x, you'll find the following:
str(x)
# List of 3
# $ cord1 : num [1:2] 0.2 6.8
# $ cord2 : Named num [1:2] 3.42 4.78
# ..- attr(*, "names")= chr [1:2] "" ""
# $ pagenum: num 2
Or, simply:
x$cord2
# 3.420247 4.782485
These are the dimensions of your resulting image, and this information can probably easily be plugged into a function to make your plots better.
Good luck!
So here's my solution...with some examples...
I've just copied and pasted my Rmd file to demonstrate how to use it.
You should be able to just copy and paste it into a blank Rmd file and then knit to HTML to see the results...
Ideally what I would have liked would have been to make it all one nice neat function rather than splitting it up into two (i.e. setup.table & print.table) but since chunk options can't be changed mid chunk as suggested by Yihui, it had to be split up into two functions...
`dprint` + `knitr` Examples to create table images
===========
```{r}
library(dprint)
# creating the style object to be used
CBs <- style(frmt.bdy=frmt(fontfamily="HersheySans"),
frmt.tbl=frmt(bty="o", lwd=1),
frmt.col=frmt(fontfamily="HersheySans", bg="khaki",
fontface="bold", lwd=2, bty="_"),
frmt.grp=frmt(fontfamily="HersheySans",bg="khaki",
fontface="bold"),
frmt.main=frmt(fontfamily="HersheySans", fontface="bold",
fontsize=12),
frmt.ftn=frmt(fontfamily="HersheySans"),
justify="right", tbl.buf=0)
# creating a setup function to setup printing a table (will probably put this function into my .Rprofile file)
setup.table <- function(df,width=10, style.obj='CBs'){
require(dprint)
table.style <- get(style.obj)
a <- tbl.struct(~., df)
b <- char.dim(a, style=table.style)
p <- pagelayout(dtype = "rgraphics", pg.dim = NULL, margins = NULL)
f <- size.simp(a[[1]], char.dim.obj=b, loc.y=0, pagelayout=p)
# now to work out the natural table width to height ratio (w.2.h.r) GIVEN the style
w.2.h.r <- as.numeric(f$tbl.width/(f$tbl.height +b$linespace.col+ b$linespace.main))
height <- width/w.2.h.r
table.width <- width
table.height <- height
# Setting chunk options to have right fig dimensions for the next chunk
knitr::opts_chunk$set(fig.width = as.numeric(width + 0.1))
knitr::opts_chunk$set(fig.height = as.numeric(height + 0.1))
# assigning relevant variables to be used when printing
assign("table.width",table.width, envir=.GlobalEnv)
assign("table.height",table.height, envir=.GlobalEnv)
assign("table.style", table.style, envir=.GlobalEnv)
}
# function to print the table (will probably put this function into my .Rprofile file as well)
print.table <- function(df, row.2.hl='2012-04-30', colour='lightblue',...) {
x <-dprint(~., data=df, style=table.style, pg.dim=c(table.width,table.height), ..., newpage=TRUE,fit.width=TRUE, row.hl=row.hl(which(df[,1]==row.2.hl), col=colour))
}
```
```{r}
# Giving it a go!
# Setting up two different size tables
small.df <- data.frame(matrix(1:100, 10,10))
big.df <- data.frame(matrix(1:800,40,20))
```
```{r}
# Using the created setup.table function
setup.table(df=small.df, width=10, style.obj='CBs')
```
```{r}
# Using the print.table function
print.table(small.df,4,'lightblue',main='table title string') # highlighting row 4
```
```{r}
setup.table(big.df,13,'CBs') # now setting up a large table
```
```{r}
print.table(big.df,38,'orange', main='the big table!') # highlighting row 38 in orange
```
```{r}
d <- style() # the default style this time will be used
setup.table(big.df,15,'d')
```
```{r}
print.table(big.df, 23, 'indianred1') # this time highlighting row 23
```
I am using the LibSVM tool for support vector classification.
The first line of my input data file looks like this:
+1 15752:47 6279:45 475:40 5231:30 515:29 7529:28 11623:24 274:24 15431:21 7342:20 4819:20 7598:18 8853:17 11134:16 501:16 911:15 4656:15 5875:14 10725:13 7334:13 13762:13 8295:12 9314:12 317:12 10641:12 2690:12 8771:12 4698:11 11519:10 10069:9 10019:8 1120:8 15017:8 254:8 7900:8 5395:8 486:8 1763:8 11183:7 9163:7 9219:7 1827:7 11901:7 4068:6 15592:6 9925:6 3464:5 8408:5 15348:5 8432:5 10064:5 6319:4 5729:4 8334:4 11817:4 6238:4 4521:4 11761:4 328:4 15876:4 6494:4 280:4 14628:4 5514:4 6383:4 9149:4 2456:4 6741:4 482:4 2773:4 10873:3 8715:3 8802:3 11478:3 11848:3 12269:3 10592:3 12911:3 11051:3 10798:3 8412:3 232:3 7654:3 1210:3 502:3 12687:3 14459:2 2725:2 9851:2 5799:2 16046:2 3612:2 1440:2 8503:2 245:2 9780:2 322:2 11902:2 8977:2 14949:2 5710:2 6423:2 9896:2 5507:2 10646:2 9932:2 14894:2 3997:2 13429:2 9845:2 8547:2 2720:2 861:2 2830:2 5703:2 6994:2 13973:2 3086:2 262:2 7793:2 208:2 3221:2 13229:2 13350:2 372:2 10384:2 3970:2 13506:2 9720:2 8981:2 9296:1 10276:1 15098:1 6631:1 383:1 6510:1 13304:1 9646:1 8233:1 1080:1 8537:1 12129:1 10711:1 14569:1 2969:1 1215:1 12435:1 7689:1 12626:1 14609:1 13474:1 4488:1 103:1 621:1 12430:1 617:1 514:1 11673:1 215:1 8817:1 10968:1 4717:1 1807:1 5737:1 3156:1 14320:1 13457:1 12411:1 9596:1 15028:1 10531:1 4301:1 4799:1 6013:1 7619:1 6717:1 9344:1 1817:1 15868:1 11307:1 9632:1 6945:1 9916:1 11899:1 883:1 11696:1 14503:1 316:1 4012:1 9994:1 8501:1 1847:1 12534:1 14966:1 11800:1 8093:1 13403:1 7309:1 5957:1 6538:1 2535:1 7042:1 13792:1 15001:1 4894:1 4921:1 13739:1 15875:1 15802:1 14253:1 10376:1 974:1 1882:1 2397:1 8105:1 4725:1 7707:1 7506:1 9749:1 8640:1 12566:1
The name of my input data file is a1a.
I tried to run the program from the Windows command prompt as
svm-train a1a
I get the following error
Wrong input format at line 1
Could somebody help me out here? I can't seem to figure out what's wrong.
Thanks.
The feature indices (14253, 10376, etc.) have to be listed in ascending order on each line. Once you do that, svm-train will accept the data. So, for example, your file needs to begin:
+1 103:1 208:2 215:1 232:3 245:2 254:8 262:2 274:24 280:4 316:1 317:12 322:2 328:4 372:2 383:1 475:40 482:4 486:8 501:16 502:3 514:1 515:29 617:1 621:1 861:2 883:1 911:15 974:1 1080:1 1120:8 1210:3 1215:1 1440:2 1763:8 1807:1 1817:1 1827:7 1847:1 1882:1 2397:1 2456:4 2535:1 2690:12 2720:2 2725:2 2773:4 2830:2 2969:1 3086:2 3156:1 3221:2 3464:5 3612:2 3970:2 3997:2 4012:1 4068:6 4301:1 4488:1 4521:4 4656:15 4698:11 4717:1 4725:1 4799:1 4819:20 4894:1 4921:1 5231:30 5395:8 5507:2 5514:4 5703:2 5710:2 5729:4 5737:1 5799:2 5875:14 5957:1 6013:1 6238:4 6279:45 6319:4 6383:4 6423:2 6494:4 6510:1 6538:1 6631:1 6717:1 6741:4 6945:1 6994:2 7042:1 7309:1 7334:13 7342:20 7506:1 7529:28 7598:18 7619:1 7654:3 7689:1 7707:1 7793:2 7900:8 8093:1 8105:1 8233:1 8295:12 8334:4 8408:5 8412:3 8432:5 8501:1 8503:2 8537:1 8547:2 8640:1 8715:3 8771:12 8802:3 8817:1 8853:17 8977:2 8981:2 9149:4 9163:7 9219:7 9296:1 9314:12 9344:1 9596:1 9632:1 9646:1 9720:2 9749:1 9780:2 9845:2 9851:2 9896:2 9916:1 9925:6 9932:2 9994:1 10019:8 10064:5 10069:9 10276:1 10376:1 10384:2 10531:1 10592:3 10641:12 10646:2 10711:1 10725:13 10798:3 10873:3 10968:1 11051:3 11134:16 11183:7 11307:1 11478:3 11519:10 11623:24 11673:1 11696:1 11761:4 11800:1 11817:4 11848:3 11899:1 11901:7 11902:2 12129:1 12269:3 12411:1 12430:1 12435:1 12534:1 12566:1 12626:1 12687:3 12911:3 13229:2 13304:1 13350:2 13403:1 13429:2 13457:1 13474:1 13506:2 13739:1 13762:13 13792:1 13973:2 14253:1 14320:1 14459:2 14503:1 14569:1 14609:1 14628:4 14894:2 14949:2 14966:1 15001:1 15017:8 15028:1 15098:1 15348:5 15431:21 15592:6 15752:47 15802:1 15868:1 15875:1 15876:4 16046:2
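Sorting a line like that by hand is not practical, so here is a minimal Python sketch that reorders the index:value pairs of each line. The function names are my own and not part of LibSVM, and it assumes every line has the form "label index:value index:value ...".

```python
def sort_libsvm_line(line: str) -> str:
    """Return a LibSVM-format line with its index:value pairs in ascending index order."""
    label, *pairs = line.split()
    # Sort numerically by the feature index (the part before the colon).
    pairs.sort(key=lambda pair: int(pair.split(":", 1)[0]))
    return " ".join([label, *pairs])


def sort_libsvm_file(src_path: str, dst_path: str) -> None:
    """Write every non-empty line of src_path to dst_path with sorted features."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            if line.strip():
                dst.write(sort_libsvm_line(line) + "\n")


print(sort_libsvm_line("+1 15752:47 6279:45 475:40 103:1"))
# prints: +1 103:1 475:40 6279:45 15752:47
```

For the file from the question, something like sort_libsvm_file("a1a", "a1a.sorted") followed by svm-train a1a.sorted should then work (the output file name is just an example).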