Change row names in data frame in R - for-loop

I've got the following code to iterate through directory/subdirectory, pick out certain files, read a value in them, and populate a new data frame with those values. It works, with a few issues...
Here's the code
wd = setwd("/Users/TK/Downloads/DataCSV")
Groups <- list.dirs(path = wd, full.names = TRUE, recursive = FALSE)
Subj <- list.dirs(path = Groups, full.names = TRUE, recursive = FALSE)
section_area_vector <- numeric()
for(i in Subj) {
  setwd(i)
  section_area <- list.files(path = i, pattern = "section_area",
                             full.names = FALSE, recursive = TRUE)
  read_area <- sapply(section_area, function(x) read.csv(x)[1,2])
  total_area_subj <- sum(read_area)
  section_area_vector <- rbind(section_area_vector, total_area_subj)
}
section_area_data <- as.data.frame(section_area_vector)
colnames(section_area_data)[colnames(section_area_data) == "V1"] <- "Area"
The output looks like this table:
How do I get the row names to appear as subj.1, subj.2, subj.3?
Also, I seem to have to run the code twice: the first run doesn't work (basically a null result), but the second run works and yields the table. Any ideas why this might be?
Also, is this the best way to write this task, or is there something more elegant? I know "for loops" are frowned upon as they are slow (eventually there will be lots of data to work with)...tried using sapply functions but got lost in the syntax. Would love some suggestions if this code can be improved.
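A possible restructuring, offered as a sketch rather than a definitive answer. Note that setwd() returns the previous working directory, not the new one, so on the first run wd holds the old location, which may explain the empty first result. The sketch below avoids setwd() altogether by using full paths, replaces the loop with vapply, and sets the row names when the data frame is built. The temporary directory tree, the file names, and the helper total_area are assumptions made only so the example runs on its own.

```r
# Build a small throwaway directory tree so the sketch runs anywhere.
# The layout (group/subject/section_area*.csv) is assumed from the question.
root <- file.path(tempdir(), "DataCSV")
for (s in c("groupA/s1", "groupA/s2", "groupB/s3")) {
  dir.create(file.path(root, s), recursive = TRUE, showWarnings = FALSE)
  write.csv(data.frame(id = 1, area = 10),
            file.path(root, s, "section_area_1.csv"), row.names = FALSE)
  write.csv(data.frame(id = 1, area = 5),
            file.path(root, s, "section_area_2.csv"), row.names = FALSE)
}

# Full paths everywhere, so the working directory never has to change.
groups <- list.dirs(root, full.names = TRUE, recursive = FALSE)
subj   <- list.dirs(groups, full.names = TRUE, recursive = FALSE)

# Hypothetical helper: sum cell [1, 2] of every matching file under one subject.
total_area <- function(dir) {
  files <- list.files(dir, pattern = "section_area",
                      full.names = TRUE, recursive = TRUE)
  sum(vapply(files, function(f) as.numeric(read.csv(f)[1, 2]), numeric(1)))
}

# One row per subject, with row names assigned at construction time.
section_area_data <- data.frame(
  Area = vapply(subj, total_area, numeric(1)),
  row.names = paste0("subj.", seq_along(subj))
)
print(section_area_data)
```

With real data, only root would need to point at the actual directory; the block that writes the sample files can be dropped.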

Answering the Longest Substring Without Repeating Characters in Kotlin

I've spent some time working on the problem and got this close:
fun lengthOfLongestSubstring(s: String): Int {
    var set = HashSet<Char>()
    var initialChar = 0
    var count = 0
    s.forEach { r ->
        while(!set.add(s[r]))
            set.remove(s[r])
        initialChar++
        set.add(s[r])
        count = maxOf(count, r - initialChar + 1)
    }
    return count
}
I understand that a HashSet is needed to answer the question since it doesn't allow for repeating characters but I keep getting a type mismatch error. I'm not above being wrong. Any assistance will be appreciated.
Your misunderstanding is that r represents a character in the string, not an index of the string, so saying s[r] doesn't make sense. You just mean r.
But you are also using r as an index (in r - initialChar + 1), so you should be using forEachIndexed, which lets you access both the element of the sequence and the index of that element:
s.forEachIndexed { i, r ->
    while(!set.add(r))
        set.remove(r)
    initialChar++
    set.add(r)
    count = maxOf(count, i - initialChar + 1)
}
Though there are still some parts of your code that don't quite make sense.
while(!set.add(r)) set.remove(r) is functionally the same as set.add(r). If add returns false, that means the element is already in the set, you remove it and the next iteration of the loop adds the element back into the set. If add returns true, that means the set didn't have the element and it was successfully added, so in any case, the result is you add r to the set.
And then you do set.add(r) again two lines later for some reason?
Anyway, here is a brute-force solution that you can use as a starting point to optimise:
fun lengthOfLongestSubstring(s: String): Int {
    val set = mutableSetOf<Char>()
    var currentMax = 0
    // for each substring starting at index i...
    for (i in s.indices) {
        // update the current max from the previous iterations...
        currentMax = maxOf(currentMax, set.size)
        // clear the set to record a new substring
        set.clear()
        // loop through the characters in this substring
        for (j in i..s.lastIndex) {
            if (!set.add(s[j])) { // if the letter already exists
                break // go to the next iteration of the outer for loop
            }
        }
    }
    return maxOf(currentMax, set.size)
}

Quantstrat applyStrategy incorrect dimensions trying to work with manual mktdata OHLCV data vs getSymbols

I apologize for not having a working example atm
All I really need is a sample format for how to load multiple symbols from a csv
The documentation for the function call (https://www.rdocumentation.org/packages/quantstrat/versions/0.16.7/topics/applyStrategy) describes mktdata as:
"an xts object containing market data. depending on indicators, may need to be in OHLCV or BBO formats, default NULL"
The reason I don't wish to use getSymbols is because I do some preprocessing and load the data from csv's because my internet is shoddy. I do download data, but about once a week. My preprocess produces different symbols from a subset of 400 symbols based on the time periods I scan. I'm trying to frontload all my download processing, and no matter what I try, I can't get it to load from either a dataframe or an xts object. Right now I'm converting from csv to dataframe to xts and attempting to load.
I have noticed that my xts objects differ from the ones getSymbols produces (hence the error about incorrect dimensions), specifically when I call colnames: mine say none, whereas the getSymbols subelements list 6 columns.
Anyway, what I would like to see is a minimal example of loading custom OHLCV data from a csv into an xts object that can be supplied to mktdata = in the applyStrategy call. That way I can format my code to match.
I have the code to load and create the xts object from a dataframe.
#loads from a dataframe which includes Symbol, Date, Open, High, Low, Close, Volume, Adjusted
tempData <- symbol_data_set[symbol_data_set$Symbol %in% symbolstring & symbol_data_set$Date >= startDate & symbol_data_set$Date<=endDate,]
#creates a list of xts
vectorXTS <- mclapply(symbolstring,function(x)
{
  df <- symbol_data_set[symbol_data_set$Symbol==x & symbol_data_set$Date >= startDate & symbol_data_set$Date<=endDate,]
  #temp <- as.xts(
  temp <- cbind(as.data.frame(df[,2]),as.data.frame(df[,-1:-2]))
  rownames(df) <- df$Date
  #,order.by=as.POSIXct(df$Date),)
  z <- read.zoo(temp, index = 1, col.names=TRUE, header = TRUE)
  #sets names to Symbol.Open ...
  colnames(z) <- c(paste0(symbolstring[x],".Open"),paste0(symbolstring[x],".High"),paste0(symbolstring[x],".Low"),paste0(symbolstring[x],".Close"),paste0(symbolstring[x],".Volume"),paste0(symbolstring[x],".Adjusted"))
  return(as.xts(z, match.to=AAPL))
  #colnames(as.xts(z))
})
names(symbolstring) <- symbolstring
names(vectorXTS) <- symbolstring
for(i in symbolstring) assign(symbolstring[i],vectorXTS[i])
colnames(tempData) <- c(paste0(x,".Symbol"),paste0(x,".Date"),paste0(x,".Open"),paste0(x,".High"),paste0(x,".Low"),paste0(x,".Close"),paste0(x,".Volume"),paste0(x,".Adjusted"))
head(tempData)
rownames(tempData) <- tempData$Date
#attempts to use this xts object I created
results <- applyStrategy(strategy= strategyName, portfolios = portfolioName,symbols=symbolstring,mktdata)
error
Error in mktdata[, keep] : incorrect number of dimensions
This is how you store an xts object from getSymbols in a file and reload it for use with quantstrat's applyStrategy. Two methods are shown; the read.zoo method (wrapped in as.xts) is the better one, since you can see how the csv's are stored.
getSymbols("AAPL",from=startDate,to=endDate,adjust=TRUE,src='yahoo',auto.assign = TRUE)
saveRDS(AAPL, file= 'stuff.Rdata')
AAPL <- readRDS(file= 'stuff.Rdata')
write.zoo(AAPL,file="zoo.csv", index.name = "Date", row.names=FALSE)
rm(AAPL)
AAPL <- as.xts(read.zoo(file="zoo.csv",header = TRUE))
If you want to work with multiple symbols, the following worked for me.
Note: initially I had a reference to the 1st element, i.e. vectorXTS[[1]], and it worked.
Note: at least setting it up like this got it to run...
vectorXTS <- mclapply(symbolstring,function(x)
{
  df <- symbol_data_set[symbol_data_set$Symbol==x & symbol_data_set$Date >= startDate & symbol_data_set$Date<=endDate,]
  temp <- cbind(as.data.frame(df[,2]),as.data.frame(df[,-1:-2]))
  rownames(df) <- df$Date
  z <- read.zoo(temp, index = 1, col.names=TRUE, header = TRUE)
  colnames(z) <- c(paste0(x,".Open"),paste0(x,".High"),paste0(x,".Low"),paste0(x,".Close"),paste0(x,".Volume"),paste0(x,".Adjusted"))
  write.zoo(z,file=paste0(x,"zoo.csv"), index.name = "Date", row.names=FALSE)
  return(as.xts(read.zoo(file=paste0(x,"zoo.csv"),header = TRUE)))
})
names(vectorXTS) <- symbolstring
#this will assign to memory vs vectorXTS if one wishes to avoid using mktdata = vectorXTS[[]]
for(i in symbolstring) assign(i,vectorXTS[[i]])
results <- applyStrategy(strategy= strategyName, portfolios = portfolioName,symbols=symbolstring, mktdata = vectorXTS[[]])
#alternatively
#results <- applyStrategy(strategy= strategyName, portfolios = portfolioName,symbols=symbolstring)

For Loop in Shiny Server: How to Not Overwrite Values with Each ActionButton Press?

I am trying to create an app in which part of the UI displays a wordcloud generated from words/strings entered by the user. To do this, I pass the input to a for loop, which is supposed to store every input in an initially empty vector with every press of the action button. However, I am encountering two problems: no word cloud is displayed (and no error is indicated), and the for loop overwrites the vector each time the button is pressed, so it only ever holds one word instead of gradually accumulating more. I figured the lack of display is because there is only one word, and it seems like wordcloud needs at least two words to print anything. So how can I get the for loop to work as intended with Shiny?
library(shiny)
library(stringr)
library(stringi)
library(wordcloud2)
ui <- fluidPage(
  titlePanel("Strings Sim"),
  sidebarLayout(
    sidebarPanel(
      textInput("string.input", "Create a string:", placeholder = "string <-"),
      actionButton("go1", "GO!")
    ),
    mainPanel(
      textOutput("dummy"),
      wordcloud2Output("the.cloud")
    )
  )
)
server <- function(input, output, session) {
  observeEvent(input$go1, {
    high.strung <- as.vector(input$string.input)
    empty.words <- NULL
    for (i in high.strung) {
      empty.words <- c(empty.words, i)
    }
    word.vector <- matrix(empty.words, nrow = length(empty.words), ncol = 1)
    num.vector <- matrix(sample(1000), nrow = length(empty.words), ncol = 1)
    prelim <- cbind(word.vector, num.vector)
    prelim.data <- as.data.frame(prelim)
    prelim.data$V2 <- as.numeric(as.character(prelim.data$V2))
    output$the.cloud <- renderWordcloud2(
      wordcloud2(prelim.data)
    )
    print(empty.words)
  })
}
shinyApp(ui=ui,server=server)
The operation works as intended when I run it without Shiny code; I basically just use a string in place of the input, run through the for loop a few times to generate the dataframe to be used by word cloud, and get something like the attached picture, which is what I am after:
Functional code without Shiny:
empty.words <- NULL
#Rerun below here to populate vector with more words and regenerate wordcloud
high.strung <- as.vector("gumbo")
for (i in high.strung) {
empty.words <- c(empty.words, i)
return(empty.words)
}
word.vector <-matrix(empty.words, nrow = length(empty.words),ncol=1)
num.vector <- matrix(sample(1000), nrow=length(empty.words),ncol=1)
prelim <- cbind(word.vector, num.vector)
prelim.data <- as.data.frame(prelim)
prelim.data$V2 <- as.numeric(as.character(prelim.data$V2))
str(prelim.data)
wordcloud2(prelim.data)
Any help is much appreciated!
Edit: More pictures of the desired output using the non-Shiny code. (I edited the dataframe output to overlay the wordcloud just to show the cloud and frame in one picture, i.e. they don't need to be displayed that way.) With each press of the button, the entered word(s) should be added to the dataframe that builds the cloud, gradually making it larger. The random number vector which determines the size doesn't have to stay the same with each press, but each entered word should be preserved in a vector.
Your app is missing reactivity. You can read about that concept here. You can input strings and as soon as at least two words are in the dataframe the wordcloud is rendered. If you don't want multi-word strings to be split just take out the str_split() function.
library(shiny)
library(stringr)
library(stringi)
library(wordcloud2)
ui <- fluidPage(
  titlePanel("Strings Sim"),
  sidebarLayout(
    sidebarPanel(
      textInput("string.input", "Create a string:", placeholder = "string <-"),
      actionButton("go1", "GO!")
    ),
    mainPanel(
      textOutput("dummy"),
      wordcloud2Output("the.cloud")
    )
  )
)
server <- function(input, output, session) {
  rv <- reactiveValues(high.strung = NULL)
  observeEvent(input$go1, {
    rv$high.strung <- c(rv$high.strung, str_split(c(input$string.input), pattern = " ") %>% unlist)
  })
  prelim.data <- reactive({
    prelim <- data.frame(
      word.vector = rv$high.strung,
      num.vector = sample(1000, length(rv$high.strung), replace = TRUE)
    )
  })
  output$the.cloud <- renderWordcloud2(
    if (length(rv$high.strung) > 0)
      wordcloud2(prelim.data())
  )
}
shinyApp(ui=ui,server=server)

Listing functions with debug flag set in R

I am trying to find a global counterpart to isdebugged() in R. My scenario is that I have functions that make calls to other functions, all of which I've written, and I am turning debug() on and off for different functions during my debugging. However, I may lose track of which functions are set to be debugged. When I forget and start a loop, I may get a lot more output (nuisance, but not terrible) or I may get no output when some is desired (bad).
My current approach is to use a function similar to the one below, and I can call it with listDebugged(ls()) or list the items in a loaded library (examples below). This could suffice, but it requires that I call it with the list of every function in the workspace or in the packages that are loaded. I can wrap another function that obtains these. It seems like there should be an easier way to just directly "ask" the debug function or to query some obscure part of the environment where it is stashing the list of functions with the debug flag set.
So, a two-part question:
1. Is there a simpler call that exists to query the functions with the debug flag set?
2. If not, then is there any trickery that I've overlooked? For instance, if a function in one package masks another, I suspect I may return a misleading result.
I realize that there is another method I could try and that is to wrap debug and undebug within functions that also maintain a hidden list of debugged function names. I'm not yet convinced that's a safe thing to do.
UPDATE (8/5/11): I searched SO and didn't find earlier questions. However, SO's "related questions" list has since shown an earlier question that is similar, though the function in the answer to that question is both more verbose and slower than the function offered by @cbeleites. The older question also doesn't provide any code, while I did. :)
The code:
listDebugged <- function(items){
  isFunction <- vector(length = length(items))
  isDebugged <- vector(length = length(items))
  for(ix in seq_along(items)){
    isFunction[ix] <- is.function(eval(parse(text = items[ix])))
  }
  for(ix in which(isFunction == 1)){
    isDebugged[ix] <- isdebugged(eval(parse(text = items[ix])))
  }
  names(isDebugged) <- items
  return(isDebugged)
}
# Example usage
listDebugged(ls())
library(MASS)
debug(write.matrix)
listDebugged(ls("package:MASS"))
Here's my throw at the listDebugged function:
ls.deb <- function(items = search ()){
  .ls.deb <- function (i){
    f <- ls (i)
    f <- mget (f, as.environment (i), mode = "function",
               ## return a function that is not debugged
               ifnotfound = list (function (x) function () NULL)
               )
    if (length (f) == 0)
      return (NULL)
    f <- f [sapply (f, isdebugged)]
    f <- names (f)
    ## now check whether the debugged function is masked by a not debugged one
    masked <- !sapply (f, function (f) isdebugged (get (f)))
    ## generate pretty output format:
    ## "package::function" and "(package::function)" for masked debugged functions
    if (length (f) > 0) {
      if (grepl ('^package:', i)) {
        i <- gsub ('^package:', '', i)
        f <- paste (i, f, sep = "::")
      }
      f [masked] <- paste ("(", f [masked], ")", sep = "")
      f
    } else {
      NULL
    }
  }
  functions <- lapply (items, .ls.deb)
  unlist (functions)
}
I chose a different name, as the output contains only the debugged functions (otherwise I easily get thousands of functions).
The output has the form package::function (or rather namespace::function, but packages will have namespaces pretty soon anyway).
If the debugged function is masked, the output is "(package::function)".
The default is to look through the whole search path.
This is a simple one-liner using lsf.str:
which(sapply(lsf.str(), isdebugged))
You can change environments within the function, see ?lsf.str for more arguments.
Since the original question, I've been looking more and more at Mark Bravington's debug package. If using that package, then check.for.traces() is the appropriate command to list those functions that are being debugged via mtrace.
The debug package is worth a look if one is spending much time with the R debugger and various trace options.
@cbeleites I like your answer, but it didn't work for me. I got this to work, but it is less functional than yours above (no recursive checks, no pretty print).
require(plyr)
debug.ls <- function(items = search()){
  .debug.ls <- function(package){
    f <- ls(package)
    active <- f[which(aaply(f, 1, function(x){
      tryCatch(isdebugged(x), error = function(e){FALSE}, finally=FALSE)
    }))]
    if(length(active)==0){
      return(NULL)
    }
    active
  }
  functions <- lapply (items, .debug.ls)
  unlist (functions)
}
I constantly get caught in the browser window frame because of failing to undebug functions. So I have created two functions and added them to my .Rprofile. The helper functions are pretty straight forward.
require(logging)
# Returns a vector of functions on which the debug flag is set
debuggedFuns <- function() {
  envs <- search()
  debug_vars <- sapply(envs, function(each_env) {
    funs <- names(Filter(is.function, sapply(ls(each_env), get, each_env)))
    debug_funs <- Filter(isdebugged, funs)
    debug_funs
  })
  return(as.vector(unlist(debug_vars)))
}
# Removes the debug flag from all the functions returned by `debuggedFuns`
unDebugAll <- function(verbose = TRUE) {
  toUnDebug <- debuggedFuns()
  if (length(toUnDebug) == 0) {
    if (verbose) loginfo('no Functions to `undebug`')
    return(invisible())
  } else {
    if (verbose) loginfo('undebugging [%s]', paste0(toUnDebug, collapse = ', '))
    for (each_fn in toUnDebug) {
      undebug(each_fn)
    }
    return(invisible())
  }
}
I have tested them out, and they work pretty well. Hope this helps!
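The other idea raised in the question, wrapping debug and undebug so that they also maintain a hidden list of names, could be sketched roughly as follows. This is only an illustration under stated assumptions: the names .debug_registry, debugTracked, undebugTracked and listTracked are hypothetical and not part of base R, and functions are looked up by name from the caller's environment.

```r
# Hypothetical helpers; none of these names exist in base R.
.debug_registry <- new.env()

# Set the debug flag on a function (looked up by name) and record the name.
debugTracked <- function(fname, envir = parent.frame()) {
  f <- get(fname, envir = envir, mode = "function")
  debug(f)
  assign(fname, TRUE, envir = .debug_registry)
  invisible(fname)
}

# Clear the flag (if it is set) and drop the name from the registry.
undebugTracked <- function(fname, envir = parent.frame()) {
  f <- get(fname, envir = envir, mode = "function")
  if (isdebugged(f)) undebug(f)
  if (exists(fname, envir = .debug_registry, inherits = FALSE)) {
    rm(list = fname, envir = .debug_registry)
  }
  invisible(fname)
}

# Recorded names, keeping only those whose debug flag is still set.
listTracked <- function(envir = parent.frame()) {
  tracked <- ls(.debug_registry, all.names = TRUE)
  still_set <- vapply(tracked, function(n) {
    exists(n, envir = envir, mode = "function") &&
      isdebugged(get(n, envir = envir, mode = "function"))
  }, logical(1))
  tracked[still_set]
}

# Example usage
f <- function(x) x + 1
debugTracked("f")
print(listTracked())
undebugTracked("f")
print(listTracked())
```

The caveat the question hints at still applies: the registry only knows about functions flagged through these wrappers. A direct call to debug() never appears in it, and redefining a function drops its flag, which is why listTracked re-checks isdebugged instead of trusting the stored names.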

Populating a list in Scala with random doubles taking forever

I am new to Scala and am trying to get a list of random double values:
The thing is, when I try to run this, it takes way too long compared to its Java counterpart. Any ideas on why this is or a suggestion on a more efficient approach?
def random: Double = java.lang.Math.random()
var f = List(0.0)
for (i <- 1 to 200000)
( f = f ::: List(random*100))
f = f.tail
You can also achieve it like this:
List.fill(200000)(math.random)
the same goes for e.g. Array ...
Array.fill(200000)(math.random)
etc ...
You could construct an infinite stream of random doubles:
def randomList(): Stream[Double] = Stream.cons(math.random, randomList())
val f = randomList().take(200000)
This will leverage lazy evaluation so you won't calculate a value until you actually need it. Even evaluating all 200,000 will be fast though. As an added bonus, f no longer needs to be a var.
Another possibility is:
val it = Iterator.continually(math.random)
it.take(200000).toList
Stream also has a continually method if you prefer.
First of all, it is not taking longer than Java because there is no Java counterpart. Java does not have an immutable list. If it did, performance would be about the same.
Second, it's taking a lot of time because appending to a list takes linear time, so the whole loop takes quadratic time.
Instead of appending, prepend, which takes constant time.
If you're using mutable state anyway, you should use a mutable collection like a buffer, which you can add to with += (that would then be the real counterpart to the Java code).
But why don't you use a for comprehension?
val f = for (_ <- 1 to 200000) yield (math.random * 100)
By the way: var f = List(0.0) ... f = f.tail can be replaced by var f: List[Double] = Nil in your example (no more performance, but more beauty ;).
Yet more options! Tail recursion:
def randlist(n: Int, part: List[Double] = Nil): List[Double] = {
  if (n<=0) part
  else randlist(n-1, 100*random :: part)
}
or mapped ranges:
(1 to 200000).map(_ => 100*random).toList
Looks like you want to use Vector instead of List. List has O(1) prepend, Vector has O(1) append. Since you are appending, but using concatenation, it'll be faster to use Vector:
def random: Double = java.lang.Math.random()
var f: Vector[Double] = Vector()
for (i <- 1 to 200000)
  f = f :+ (random*100)
Got it?
