Tidyeval: create a new variable from a paste statement

I want a custom function that takes a number, pastes a prefix in front of it to build a variable name, and then operates on that variable, which already exists in the data (it is not a newly created variable).
This is a weird hypothetical example, but in the data I'm working with I have to do some recoding where variable names include numbers and there are sequential patterns.
mtcars_revid <- mtcars %>% mutate(blah1234=drat)
test_func <- function(data,initial_var_num,var_name) {
main_var <- paste0("blah",initial_var_num)
data %>%
mutate({{var_name}}:=main_var,
{{var_name}}:=ifelse({{var_name}}<999998,{{var_name}},NA))
}
mtcars_revid %>%
test_func(1234,new_variable_name) %>%
summarize(test_var_mean=mean(new_variable_name),
correct_mean=mean(blah1234))
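A minimal sketch of one way this could work, assuming dplyr 1.0 or later: `main_var` is a character string, so `mutate()` stores the literal text "blah1234" rather than the column's values. Looking the column up through the `.data` pronoun (`.data[[main_var]]`) retrieves the existing column by name. The function and column names are taken from the question; nothing else is implied about the real data.

```r
library(dplyr)

mtcars_revid <- mtcars %>% mutate(blah1234 = drat)

test_func <- function(data, initial_var_num, var_name) {
  # Build the name of the existing column as a string, e.g. "blah1234"
  main_var <- paste0("blah", initial_var_num)
  data %>%
    mutate(
      # .data[[...]] looks up a column by its string name
      {{ var_name }} := .data[[main_var]],
      # Recode the sentinel values (999998 and above) to NA
      {{ var_name }} := ifelse({{ var_name }} < 999998, {{ var_name }}, NA)
    )
}

res <- mtcars_revid %>%
  test_func(1234, new_variable_name) %>%
  summarize(test_var_mean = mean(new_variable_name),
            correct_mean = mean(blah1234))
```

With this change the two means in `res` should agree, because `new_variable_name` is a copy of `blah1234` and no value in `drat` reaches the sentinel threshold.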

Related

Why doesn't this data.table function modify the argument? [duplicate]

I'm writing a function that, among other things, coerces the input into a data.table.
library(data.table)
df <- data.frame(id = 1:10)
f <- function(df){setDT(df)}
f(df)
df[, temp := 1]
However, the last command outputs the following warning:
Warning message: In `[.data.table`(df, , `:=`(temp, 1)) : Invalid
.internal.selfref detected and fixed by taking a copy of the whole
table so that := can add this new column by reference. At an earlier
point, this data.table has been copied by R (or been created manually
using structure() or similar). Avoid key<-, names<- and attr<- which
in R currently (and oddly) may copy the whole data.table. Use set*
syntax instead to avoid copying: ?set, ?setnames and ?setattr. Also,
in R<=v3.0.2, list(DT1,DT2) copied the entire DT1 and DT2 (R's list()
used to copy named objects); please upgrade to R>v3.0.2 if that is
biting. If this message doesn't help, please report to datatable-help
so the root cause can be fixed.
I'm using v1.9.3 of data.table and R 3.1.1. Does this mean df is copied at some point? How can I avoid this warning?
Edit:
The code of setDT actually uses NSE. So this seems to work:
df1 <- data.frame(id = 1:10)
f <- function(df){eval(substitute(setDT(df)),parent.frame())}
f(df1)
df1[, temp := 1]
It seems I can do other things with df within the function f, like:
df1 <- data.frame(id = 1:10)
f <- function(df){
eval(substitute(setDT(df)),parent.frame())
df[, temp := 1]
}
f(df1)
Is this the right way to do it?
Great question! The warning message should say: ... and fixed by taking a shallow copy of the whole table .... Will fix this.
setDT does two things:
set the class to data.table from data.frame/list
use alloc.col to over-allocate columns (so that := can be used directly)
The second step requires a shallow copy if the input is not already a data.table. This is why we assign the value back to the symbol in its environment (setDT's parent frame). But the parent frame for setDT is your function f(). Therefore the setDT(df) within your function has gone through smoothly, but the df that resides in the global environment will only have its class changed, not the over-allocation (as the shallow copy severed the link).
And in the next step, := detects that and shallow copies once again to over-allocate.
The idea so far is to use setDT to convert to data.tables before providing them to a function. But I'd like these cases to be resolved (will take a look).
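As a rough illustration of the advice above (convert with setDT before handing the object to a function), the sketch below calls setDT in the calling environment and then lets the function add a column by reference. The function name `add_temp` is hypothetical and only used for this example.

```r
library(data.table)

df <- data.frame(id = 1:10)

# Convert in the calling environment, so the over-allocated
# data.table is bound to the symbol `df` here
setDT(df)

# Hypothetical helper: expects a data.table and modifies it by reference
add_temp <- function(dt) {
  dt[, temp := 1]
  invisible(dt)
}

add_temp(df)
```

Because `df` is already a data.table when the function receives it, `:=` inside the function should not need to take a copy, and the new column is visible in the caller.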
Thanks a bunch!

Performing a calculation on several data frames with a for loop

I have a group of data frames, and I want a for loop that performs a calculation on all of them without my having to enter each data frame's name manually.
example:
df1
df2
df3
#first I try to create a list of the dataframe names to iterate through
dflist <- list(c(df1, df2, df3))
Then I attempt to iterate through it including the calculation. Simplified version here:
for (i in 1:length(dflist)) {
x <- dflist[i]$columnone[1] %>%
y <- dflist[i]$columntwo[1] %>%
z <- mean(dflist[i]$columnthree) %>%
paste0("result_",i) <- x-y/z
}
I keep being told that z cannot be found.
What am I doing wrong?
(the paste0 line at the end is meant to store the result for each dataframe as its own new variable but is not the focus of the question)
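A hedged sketch of how the loop might be written, using made-up data frames with the column names from the question. Three things differ from the snippet above: `list(df1, df2, df3)` keeps each data frame as its own element (wrapping them in `c()` flattens them into one list of columns), `[[i]]` extracts a data frame where `[i]` returns a one-element list, and the trailing `%>%` operators are dropped, since they likely chain the lines into a single expression in which `z` is used before it is assigned. Results are stored in a named list rather than in separately named variables.

```r
# Illustrative data only; the real data frames are not shown in the question
df1 <- data.frame(columnone = c(10, 2), columntwo = c(4, 5), columnthree = c(1, 3))
df2 <- data.frame(columnone = c(6, 1), columntwo = c(9, 2), columnthree = c(2, 4))
df3 <- data.frame(columnone = c(3, 7), columntwo = c(8, 0), columnthree = c(4, 4))

# One list element per data frame
dflist <- list(df1, df2, df3)

results <- vector("list", length(dflist))
for (i in seq_along(dflist)) {
  d <- dflist[[i]]          # [[i]] returns the data frame itself
  x <- d$columnone[1]
  y <- d$columntwo[1]
  z <- mean(d$columnthree)
  results[[i]] <- x - y / z # same formula as in the question
}
names(results) <- paste0("result_", seq_along(dflist))
```

Individual results can then be read as `results$result_1`, `results[["result_2"]]`, and so on.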

Quantstrat applyStrategy incorrect dimensions trying to work with manual mktdata OHLCV data vs getSymbols

I apologize for not having a working example atm
All I really need is a sample format for how to load multiple symbols from a csv
The function call says
https://www.rdocumentation.org/packages/quantstrat/versions/0.16.7/topics/applyStrategy
mktdata
"an xts object containing market data. depending on indicators, may need to be in OHLCV or BBO formats, default NULL"
The reason I don't wish to use getSymbols is because I do some preprocessing and load the data from csv's because my internet is shoddy. I do download data, but about once a week. My preprocess produces different symbols from a subset of 400 symbols based on the time periods I scan. I'm trying to frontload all my download processing, and no matter what I try, I can't get it to load from either a dataframe or an xts object. Right now I'm converting from csv to dataframe to xts and attempting to load.
I have noticed that my xts objects differ from the ones getSymbols produces (hence the error about incorrect dimensions), specifically when I call colnames: mine report none, whereas the getSymbols objects list 6 columns.
Anyway, what I would like to see is a minimal example of loading custom OHLCV data from a csv into an xts object that can be supplied to mktdata = in the applyStrategy call. That way I can format my code to match.
I have the code to load and create the xts object from a dataframe.
#loads from a dataframe which includes Symbol, Date, Open, High, Low, Close, Volume, Adjusted
tempData <- symbol_data_set[symbol_data_set$Symbol %in% symbolstring & symbol_data_set$Date >= startDate & symbol_data_set$Date<=endDate,]
#creates a list of xts
vectorXTS <- mclapply(symbolstring,function(x)
{
df <- symbol_data_set[symbol_data_set$Symbol==x & symbol_data_set$Date >= startDate & symbol_data_set$Date<=endDate,]
#temp <- as.xts(
temp <- cbind(as.data.frame(df[,2]),as.data.frame(df[,-1:-2]))
rownames(df) <- df$Date
#,order.by=as.POSIXct(df$Date),)
z <- read.zoo(temp, index = 1, col.names=TRUE, header = TRUE)
#sets names to Symbol.Open ...
colnames(z) <- c(paste0(symbolstring[x],".Open"),paste0(symbolstring[x],".High"),paste0(symbolstring[x],".Low"),paste0(symbolstring[x],".Close"),paste0(symbolstring[x],".Volume"),paste0(symbolstring[x],".Adjusted"))
return(as.xts(z, match.to=AAPL))
#colnames(as.xts(z))
})
names(symbolstring) <- symbolstring
names(vectorXTS) <- symbolstring
for(i in symbolstring) assign(symbolstring[i],vectorXTS[i])
colnames(tempData) <- c(paste0(x,".Symbol"),paste0(x,".Date"),paste0(x,".Open"),paste0(x,".High"),paste0(x,".Low"),paste0(x,".Close"),paste0(x,".Volume"),paste0(x,".Adjusted"))
head(tempData)
rownames(tempData) <- tempData$Date
#attempts to use this xts object I created
results <- applyStrategy(strategy= strategyName, portfolios = portfolioName,symbols=symbolstring,mktdata)
error
Error in mktdata[, keep] : incorrect number of dimensions
This is how you store an xts object from getSymbols in a file and reload it for use with quantstrat's applyStrategy. Two methods are shown; the write.zoo/read.zoo method is preferable, as you can see how the csv files are stored.
getSymbols("AAPL",from=startDate,to=endDate,adjust=TRUE,src='yahoo',auto.assign = TRUE)
saveRDS(AAPL, file= 'stuff.Rdata')
AAPL <- readRDS(file= 'stuff.Rdata')
write.zoo(AAPL,file="zoo.csv", index.name = "Date", row.names=FALSE)
rm(AAPL)
AAPL <- as.xts(read.zoo(file="zoo.csv",header = TRUE))
If you want to work with multiple symbols, the following worked for me.
Note: initially I had a reference to the first element, i.e. vectorXTS[[1]], and it worked.
Note: at least setting it up like this got it to run.
vectorXTS <- mclapply(symbolstring,function(x)
{
df <- symbol_data_set[symbol_data_set$Symbol==x & symbol_data_set$Date >= startDate & symbol_data_set$Date<=endDate,]
temp <- cbind(as.data.frame(df[,2]),as.data.frame(df[,-1:-2]))
rownames(df) <- df$Date
z <- read.zoo(temp, index = 1, col.names=TRUE, header = TRUE)
colnames(z) <- c(paste0(x,".Open"),paste0(x,".High"),paste0(x,".Low"),paste0(x,".Close"),paste0(x,".Volume"),paste0(x,".Adjusted"))
write.zoo(z,file=paste0(x,"zoo.csv"), index.name = "Date", row.names=FALSE)
return(as.xts(read.zoo(file=paste0(x,"zoo.csv"),header = TRUE)))
})
names(vectorXTS) <- symbolstring
#this will assign to memory vs vectorXTS if one wishes to avoid using mktdata = vectorXTS[[]]
for(i in symbolstring) assign(i,vectorXTS[[i]])
results <- applyStrategy(strategy= strategyName, portfolios = portfolioName,symbols=symbolstring, mktdata = vectorXTS[[]])
#alternatively
#results <- applyStrategy(strategy= strategyName, portfolios = portfolioName,symbols=symbolstring)
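For reference, a self-contained sketch of the csv-to-xts step on its own, without quantstrat or a network connection. The helper name `read_ohlcv_csv`, the symbol "SYM", and the sample prices are all made up for illustration; the column naming follows the Symbol.Open pattern used above. Whether the result is accepted by applyStrategy depends on the strategy and is not tested here.

```r
library(xts)  # also attaches zoo

# Write a tiny illustrative csv (made-up prices) to a temporary file
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Date,Open,High,Low,Close,Volume,Adjusted",
  "2020-01-02,10,11,9,10.5,1000,10.5",
  "2020-01-03,10.5,12,10,11.5,1500,11.5",
  "2020-01-06,11.5,12.5,11,12,1200,12"
), csv_path)

# Hypothetical helper: read one symbol's csv and return an xts object
# whose columns are named Symbol.Open, Symbol.High, ...
read_ohlcv_csv <- function(path, symbol) {
  raw <- read.csv(path, stringsAsFactors = FALSE)
  out <- xts(as.matrix(raw[, -1]), order.by = as.Date(raw$Date))
  colnames(out) <- paste(symbol, colnames(raw)[-1], sep = ".")
  out
}

SYM <- read_ohlcv_csv(csv_path, "SYM")
```

Checking `colnames(SYM)` and `dim(SYM)` against an object returned by getSymbols is a quick way to confirm that the two have the same shape.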

R Shiny- How to apply user inserted text to run a function (R code) and produce final ggplot output

I have a basic R code where, within the code, a user can enter a country name "Argentina". Using that value/name, the code will run an analysis for the "Argentina" subset of the pre-loaded data. Finally, the code will produce a simple ggplot showing results.
I have tried to make this code into a Shiny App, however I cannot get it to work properly. My main issue is that I cannot seem to get the data analysis in the Server section to work, which should subsequently feed into the plotting code. More importantly, I cannot seem to get the user inputted country name to feed into my data analysis.
Without going into the detail of the code, could someone kindly point me in the right direction of how one would do this in Shiny? e.g.
1) Field for user input;
2) Use that user input as an object used in the code;
3) subsequently run the analysis (whatever it might be); and
4) use the final analysis data frame in ggplot for a figure output to be displayed in the shiny app.
Many thanks for your assistance and time.
Please see the Shiny code I am currently using, with reproducible data based on mtcars:
library(shiny)
# Some Sample data to run app using mtcars
mtcars$Primary<- rownames(mtcars)
mtcars$Area <- "Argentina"
mtcars$Y2016<- mtcars$mpg
mtcars$Element <- "Gross Production Value (constant 2004-2006 million US$)"
# Defining UI ----
ui <- pageWithSidebar(
# App title ----
headerPanel("Subsector Selection Tool"),
# Sidebar panel for inputs ----
sidebarPanel(
# Input: Country name
textInput("country", "Please enter country name", "")#,
),
# Main panel for displaying outputs ----
mainPanel("")
)
# Define server logic to plot various variables against mpg ----
server <- function(input, output) {
#Trying to make user inputted country name into an object to be used in
#"reactive" code below, which in turn is used to make dataset for graphing
country_interest <- reactive({
paste(input$country)
})
#Here I am trying to make the data analysis code run and create desired
#dataset for graphing, and subsetting for country selected by user
Value_c_e_PRIM_x <- reactive({
Value_c <- Value[which(Value$Area==country_interest),]
Value_c_e <- Value_c[which(Value_c$Element=="Gross Production Value (constant 2004-2006 million US$)"),]
Value_c_e_PRIM$Primary <- Value_c_e_PRIM[,120]
Value_c_e_PRIM[,120] <- NULL
Value_c_e_PRIM <- Value_c_e_PRIM %>% group_by(Primary,Element) %>% summarise_at(vars(Y2016), sum)
Value_c_e_PRIM$Category <- "Value of Production"
Value_c_e_PRIM$Value <- Value_c_e_PRIM$Y2016
Value_c_e_PRIM <- Value_c_e_PRIM %>% group_by(Category,Primary) %>% summarise_at(vars(Value), mean)
})
#Graphing section: if I have the dataset "Value_c_e_PRIM_x" pre-loaded (e.g. not derived in code above), the figure is successfully shown in the output.
output$plot <- renderPlot({
Graph_data <- Value_c_e_PRIM_x
Graph_data$Score_type <- "Competitiveness Score"
Graph_data$`Competitiveness Score` <- round(Graph_data$Value, 2)
title1 <-paste("Competitiveness\nby",paste0(country_interest),"Subsector")
mycol <-c("red", "yellow", "#006600")
ggplot(data = Graph_data, aes(x = Score_type, y = reorder(Primary,Value), fill=Value)) +
geom_tile(aes(fill = Value), colour= "white")+
geom_text(data=Graph_data,aes(y=Primary, x= Score_type, label=Value))+
labs(title =(paste0(title1)),y = "", x = "")+
scale_fill_gradientn(colours = mycol)+
theme(legend.title=element_blank())+
theme(legend.position="bottom")
})
}
shinyApp(ui, server)
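A minimal sketch of the four steps listed above, under several assumptions: the data set is called `Value` and is built from mtcars as in the sample data, the analysis is reduced to a filter and a sum, and the helper `summarise_country` is a hypothetical name. The points it tries to show are that a reactive is called like a function (`graph_data()`), that `input$country` can be used directly inside a reactive, and that the UI needs a `plotOutput()` matching `output$plot`.

```r
library(shiny)
library(dplyr)
library(ggplot2)

# Sample data standing in for the real data set (assumption)
Value <- mtcars
Value$Primary <- rownames(mtcars)
Value$Area <- "Argentina"
Value$Y2016 <- Value$mpg
Value$Element <- "Gross Production Value (constant 2004-2006 million US$)"

# Hypothetical helper: the analysis as a plain function of data and country
summarise_country <- function(data, country) {
  data %>%
    filter(Area == country,
           Element == "Gross Production Value (constant 2004-2006 million US$)") %>%
    group_by(Primary) %>%
    summarise(Value = sum(Y2016), .groups = "drop")
}

ui <- fluidPage(
  textInput("country", "Please enter country name", ""),  # 1) user input
  plotOutput("plot")                                      # 4) figure output
)

server <- function(input, output) {
  # 2) and 3): re-run the analysis whenever the input changes
  graph_data <- reactive({
    req(input$country)
    summarise_country(Value, input$country)
  })

  output$plot <- renderPlot({
    d <- graph_data()  # note the parentheses: reactives are called
    validate(need(nrow(d) > 0, "No rows for that country"))
    ggplot(d, aes(x = "Competitiveness Score", y = reorder(Primary, Value), fill = Value)) +
      geom_tile(colour = "white") +
      geom_text(aes(label = round(Value, 2))) +
      labs(title = paste("Competitiveness by", input$country, "Subsector"), x = "", y = "") +
      scale_fill_gradientn(colours = c("red", "yellow", "#006600")) +
      theme(legend.title = element_blank(), legend.position = "bottom")
  })
}

# Run interactively with: shinyApp(ui, server)
```

Keeping the analysis in an ordinary function makes it possible to check it outside Shiny, for example with `summarise_country(Value, "Argentina")`.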

For Loop in Shiny Server: How to Not Overwrite Values with Each ActionButton Press?

I am trying to create an app in which part of the UI displays a wordcloud generated from words/strings entered by the user. To do this, I pass the input to a for loop that is supposed to store every input in an initially empty vector with every press of the action button. However, I am encountering a couple of problems: no word cloud is displayed, with no error indicated, and the for loop overwrites the vector each time the button is pressed, so that it only ever holds one word instead of gradually accumulating more. I figured the lack of display is because there is only one word, and it seems wordcloud needs at least two words to print anything. So how can I get the for loop to work as intended with Shiny?
library(shiny)
library(stringr)
library(stringi)
library(wordcloud2)
ui <- fluidPage(
titlePanel("Strings Sim"),
sidebarLayout(
sidebarPanel(
textInput("string.input", "Create a string:", placeholder = "string <-"),
actionButton("go1", "GO!")
),
mainPanel(
textOutput("dummy"),
wordcloud2Output("the.cloud")
)
)
)
server <- function(input, output, session) {
observeEvent(input$go1, {
high.strung <- as.vector(input$string.input)
empty.words <- NULL
for (i in high.strung) {
empty.words <- c(empty.words, i)
}
word.vector <-matrix(empty.words, nrow = length(empty.words),ncol=1)
num.vector <- matrix(sample(1000), nrow=length(empty.words),ncol=1)
prelim <- cbind(word.vector, num.vector)
prelim.data <- as.data.frame(prelim)
prelim.data$V2 <- as.numeric(as.character(prelim.data$V2))
output$the.cloud <- renderWordcloud2(
wordcloud2(prelim.data)
)
print(empty.words)
})
}
shinyApp(ui=ui,server=server)
The operation works as intended when I run it without Shiny code; I basically just use a string in place of the input, run through the for loop a few times to generate the dataframe to be used by word cloud, and get something like the attached picture, which is what I am after:
Functional code without Shiny:
empty.words <- NULL
#Rerun below here to populate vector with more words and regenerate wordcloud
high.strung <- as.vector("gumbo")
for (i in high.strung) {
empty.words <- c(empty.words, i)
}
word.vector <-matrix(empty.words, nrow = length(empty.words),ncol=1)
num.vector <- matrix(sample(1000), nrow=length(empty.words),ncol=1)
prelim <- cbind(word.vector, num.vector)
prelim.data <- as.data.frame(prelim)
prelim.data$V2 <- as.numeric(as.character(prelim.data$V2))
str(prelim.data)
wordcloud2(prelim.data)
Any help is much appreciated!
Edit: More pictures of the desired output using the non-Shiny code. (I edited the dataframe output to overlay the wordcloud just to show the cloud and frame in one picture, i.e. they don't need to be displayed that way.) With each press of the button, the entered word(s) should be added to the dataframe that builds the cloud, gradually making it larger. The random number vector which determines the size doesn't have to stay the same with each press, but each entered word should be preserved in a vector.
Your app is missing reactivity. You can read about that concept here. You can input strings and as soon as at least two words are in the dataframe the wordcloud is rendered. If you don't want multi-word strings to be split just take out the str_split() function.
library(shiny)
library(stringr)
library(stringi)
library(wordcloud2)
ui <- fluidPage(
titlePanel("Strings Sim"),
sidebarLayout(
sidebarPanel(
textInput("string.input", "Create a string:", placeholder = "string <-"),
actionButton("go1", "GO!")
),
mainPanel(
textOutput("dummy"),
wordcloud2Output("the.cloud")
)
)
)
server <- function(input, output, session) {
rv <- reactiveValues(high.strung = NULL)
observeEvent(input$go1, {
rv$high.strung <- c(rv$high.strung,str_split(c(input$string.input), pattern = " ") %>% unlist)
})
prelim.data <- reactive({
prelim <- data.frame(
word.vector = rv$high.strung,
num.vector = sample(1000, length(rv$high.strung), replace = TRUE)
)
})
output$the.cloud <- renderWordcloud2(
if (length(rv$high.strung) > 0)
wordcloud2(prelim.data())
)
}
shinyApp(ui=ui,server=server)
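The accumulation step in the answer can be looked at in isolation. In the original app, `empty.words <- NULL` runs inside the observer, so the vector is reset on every button press; the answer instead appends to a value that lives outside the observer (`rv$high.strung`). The sketch below mimics that with a plain function; `add_words` is a hypothetical name used only here.

```r
library(stringr)

# Hypothetical helper: split the new input on spaces and append it
# to the words collected so far
add_words <- function(existing, new_input) {
  c(existing, unlist(str_split(new_input, pattern = " ")))
}

words <- NULL                           # created once, outside any "button press"
words <- add_words(words, "gumbo")      # first press
words <- add_words(words, "red beans")  # second press
```

After the two simulated presses, `words` holds all three words, which is the behaviour `reactiveValues()` provides across presses in the app.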
