Modifying a specific part of the name of a DT in R

I have the following data.table:
structure(list(HKU47_PSG_1_HW_0.txt = 66611.1718226969, HKU47_PSG_1_HW_1.txt = 66254.5524579138,
HKU47_PSG_1_HW_2.txt = 66972.3593176305, HKU47_PSG_1_HW_3.txt = 68419.8681965619,
HKU47_PSG_1_HW_4.txt = 66841.3761239946, HKU47_PSG_1_HW_5.txt = 66196.5383069813), .Names = c("HKU47_PSG_1_HW_0.txt",
"HKU47_PSG_1_HW_1.txt", "HKU47_PSG_1_HW_2.txt", "HKU47_PSG_1_HW_3.txt",
"HKU47_PSG_1_HW_4.txt", "HKU47_PSG_1_HW_5.txt"), row.names = c(NA,
-1L), class = c("data.table", "data.frame"), .internal.selfref = <pointer: 0x0000000000100788>)
I would like a copy of this data.table with the substrings HW_ and .txt removed from each column name. I need something like names(data) <- c("new_name", "another_new_name"), but applied automatically, because I have several such data.tables. I do not have a clear idea of how to do that.

You can use sub to replace HW_ and .txt with an empty string, which removes those parts from the names:
names(data) <- sub('HW_', '', sub('\\.txt', '', names(data)))
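For readers who want to check the expected result, here is the same idea as a small Python sketch. It is only an illustration, not part of the R answer: clean_name is a hypothetical helper, and unlike the R call it anchors the .txt pattern at the end of the name, which is an assumption about what is wanted.

```python
import re


def clean_name(name: str) -> str:
    """Drop a trailing '.txt' and the first 'HW_' from one column name."""
    # The dot is escaped so it matches a literal '.', and '$' anchors the
    # match at the end of the name (an assumption; the R call is unanchored).
    without_ext = re.sub(r"\.txt$", "", name)
    # Replace only the first occurrence, which mirrors R's sub().
    return without_ext.replace("HW_", "", 1)


# Column names taken from the question.
names = [f"HKU47_PSG_1_HW_{i}.txt" for i in range(6)]
cleaned = [clean_name(n) for n in names]
print(cleaned[0])  # HKU47_PSG_1_0
```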

Related

openpyxl convert scraped data in time format

I am scraping time values from a website. Each value arrives as text (for example 9:15) and is written into a cell. I would like to total the hours at the bottom of the column, so I need Python to convert each value to a numeric format that can be summed.
Any ideas?
def excel():
    # Writing on a EXCEL FILE
    filename = f"Monatsplan {userfinder} {month} {year}.xlsx"
    try:
        wb = load_workbook(filename)
        ws = wb.worksheets[0]  # select first worksheet
    except FileNotFoundError:
        headers_row = [
            "Datum",
            "Tour",
            "Funktion",
            "Von",
            "Bis",
            "Schichtdauer",
            "Bezahlte Zeit",
        ]
        wb = Workbook()
        ws = wb.active
        ws.append(headers_row)
        wb.save(filename)
    ws.append(
        [
            datumcleaned[:10],
            tagesinfo,
            "",
            "",
            "",
            "",
            "",
        ]
    )
    wb.save(filename)
    wb.close()

excel()

You should split the data you scraped:
time_scraped = '9:15'
time_split = time_scraped.split(":")
hours = int(time_split[0])
minutes = int(time_split[1])
Then you can place the hours and minutes in separate columns and put a sum formula at the bottom of each column.
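As an alternative to two separate columns, the split values can be turned into durations and summed in Python before anything is written to the sheet. This is only a sketch: parse_hhmm is a hypothetical helper and the sample values are made up. It uses the standard library only, so it says nothing about how the result should be written with openpyxl.

```python
from datetime import timedelta


def parse_hhmm(text: str) -> timedelta:
    """Convert a scraped 'H:MM' string such as '9:15' into a duration."""
    hours, minutes = text.strip().split(":")
    return timedelta(hours=int(hours), minutes=int(minutes))


# Made-up sample of scraped values, for illustration only.
scraped = ["9:15", "7:45", "8:30"]

# sum() needs a timedelta start value, otherwise it starts from the int 0.
total = sum((parse_hhmm(t) for t in scraped), timedelta())

# Decimal hours are a plain number, so they can be added up in a sheet.
total_hours = total.total_seconds() / 3600
print(total_hours)  # 25.5
```

Decimal hours are used here because a plain float needs no special cell format; whether a duration or a number suits the sheet better depends on how the total should be displayed.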

R psych::statsBy() error: "'x' must be numeric"

I'm trying to run a multilevel factor analysis with the "psych" package. The recommended first step is to use the statsBy() function to get the correlation data:
statsBy(study2, group = "ID")
However, it gives this error: "Error in FUN(data[x, , drop = FALSE], ...) : 'x' must be numeric".
The dataset contains only a grouping variable, "ID", and two other numeric variables. I ran the following line to check whether the variables are numeric:
sapply(study2, is.numeric)
   ID    v1    V2
FALSE  TRUE  TRUE
Here is the traceback of the error. I don't know what 'x' refers to here, and I noticed that in lines 8 and 9 the X is capitalized, while in line 10 it is lowercase.
10. FUN(data[x, , drop = FALSE], ...)
9. FUN(X[[i]], ...)
8. lapply(X = ans[index], FUN = FUN, ...)
7. tapply(seq_len(728L), list(z = c("5edfa35e60122c277654d35b", "5ed69fbc0a53140e516ad4ed", "5d52e8160ebbe900196e252e", "5efa3da57a38f213146c7352", "5ef98f3df4d541726b1bcc48", "5debb7511e806c2a59cad664", "5c28a4530091e40001ca4d00", "5872a0d958ca4c00018ce4fe", "5c87868eddda2d00012add18", "5e80b7427567f07891655e7e", ...
6. eval(substitute(tapply(seq_len(nd), IND, FUNx, simplify = simplify)), data)
5. eval(substitute(tapply(seq_len(nd), IND, FUNx, simplify = simplify)), data)
4. structure(eval(substitute(tapply(seq_len(nd), IND, FUNx, simplify = simplify)), data), call = match.call(), class = "by")
3. by.data.frame(data, z, colMeans, na.rm = na.rm)
2. by(data, z, colMeans, na.rm = na.rm)
1. statsBy(study2, group = "ID")
The dataset has 728 rows, and values like "5edfa35e60122c277654d35b" are IDs. Could anyone help explain what might have gone wrong?

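I had the same error. The only fix I found was to convert the grouping variable to the numeric class.
Try:
study2$ID<-as.numeric(study2$ID)
statsBy(study2, group = "ID")
If study2$ID is of class character:
study2$ID<-as.numeric(as.factor(study2$ID))
statsBy(study2, group = "ID")
Note that as.numeric() applied directly to non-numeric strings such as "5edfa35e60122c277654d35b" returns NA, so for IDs like the ones in the question the second form is the one that applies.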
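To show what the factor conversion does to string IDs, here is a rough Python sketch of the same idea: each distinct ID gets an integer code based on its sorted position. factor_codes is a hypothetical helper and the IDs are shortened, made-up values. R orders factor levels according to the locale, so the exact codes it assigns may differ from this plain string sort.

```python
def factor_codes(values):
    """Map each distinct value to a 1-based integer code by sorted order."""
    levels = sorted(set(values))
    code_of = {level: i + 1 for i, level in enumerate(levels)}
    return [code_of[v] for v in values]


# Shortened, made-up IDs in the style of the question.
ids = ["5edfa35e", "5c28a453", "5edfa35e", "5872a0d9"]
codes = factor_codes(ids)
print(codes)  # [3, 2, 3, 1]
```

Rows that share an ID share a code, so the grouping is preserved while the column becomes numeric.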

How to update bokeh active interaction with GeoJSON as data source?

I have made an interactive choropleth map with bokeh, and I'm trying to add active interactions using the dropdown widget (Select). However, most tutorials and SO questions about active interactions use ColumnDataSource, and not GeoJSONDataSource.
The issue is that GeoJSONDataSource doesn't have a .data attribute like ColumnDataSource does, so I don't know exactly how the syntax works when updating it.
My dataset is a dictionary in the form of city_dict = {'Amsterdam': <some data frame>, 'Antwerp': <some data frame>, ...}, where the dataframe is in geojson format. I have already confirmed that this format works when making glyphs.
def update(attr, old, new):
    s_value = dropdown.value
    p.title.text = '%s', s_value
    new_src1 = make_dataset(s_value)
    val1 = GeoJSONDataSource(new_src1)
    r1.data_source = val1
where make_dataset is a function that transforms my original dataset into a dataset that can feed into the GeoJSONDataSource function. make_dataset requires a string (name of the city) to work eg. 'Amsterdam'. It works on passive interactions.
The main plot code (removed unnecessary stuff) is:
dropdown = Select(value='Amsterdam', options = cities)
controls = WidgetBox(dropdown)
initial_city = 'Amsterdam'
a = make_dataset(initial_city)
src1 = GeoJSONDataSource(a)
p = figure(title = 'Amsterdam', plot_height = 750 , plot_width = 900, toolbar_location = 'right')
r1 = p.patches('xs','ys', source = src1, fill_color = {'field' :'norm', 'transform' : color_mapper})
dropdown.on_change('value', update)
layout = row(controls, p)
curdoc().add_root(layout)
This is the error I get:
error handling message Message 'PATCH-DOC' (revision 1) content: {'events': [{'kind': 'ModelChanged', 'model': {'type': 'Select', 'id': '1147'}, 'attr': 'value', 'new': 'Antwerp'}], 'references': []}: ValueError("expected a value of type str, got ('%s', 'Antwerp') of type tuple",)
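The ValueError quoted above is consistent with the line p.title.text = '%s', s_value in the update function: in Python a comma builds a tuple, whereas the % operator formats a string. The sketch below shows only that difference in plain Python, without Bokeh, using the city name from the error message.

```python
city = "Antwerp"

# A comma after the format string creates a two-element tuple.
as_tuple = "%s", city

# The % operator substitutes the value and yields a str.
as_text = "%s" % city

print(type(as_tuple).__name__, as_tuple)  # tuple ('%s', 'Antwerp')
print(type(as_text).__name__, as_text)    # str Antwerp
```

Assigning s_value directly to the title text would avoid the tuple. Separately, and as an untested assumption: GeoJSONDataSource exposes a geojson property, so assigning the new GeoJSON string to src1.geojson may be a way to update the existing source instead of creating a new one.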

How to connect leaflet map clicks (events) with plot creation in a shiny app

Hello, I am creating an environmental Shiny app in which I want to use a leaflet map to create some simple plots based on the openair package (https://rpubs.com/NateByers/Openair).
General form of aq_measurements():
AQ <- aq_measurements(country = "country", city = "city", location = "location", parameter = "pollutant choice", date_from = "YYYY-MM-DD", date_to = "YYYY-MM-DD")
All parameters are available in the locations dataframe.
General form of importNOAA() from worldmet:
met <- importNOAA(code = "12345-12345", year = YYYY:YYYY)
The NOAA code is available in the locations dataframe.
Below I create a sample of my initial data frame:
location = c("100 ail","16th and Whitmore","40AB01 - ANTWERPEN")
lastUpdated = c("2018-02-01 09:30:00", "2018-02-01 03:00:00", "2017-03-07 10:00:00")
firstUpdated = c("2015-09-01 00:00:00","2016-03-06 19:00:00","2016-11-22 15:00:00")
pm25=c("FALSE","FALSE","FALSE")
pm10=c("TRUE","FALSE","FALSE")
no2=c("TRUE","FALSE","FALSE")
latitude=c(47.932907,41.322470,36.809700)
longitude=c(106.92139000,-95.93799000,-107.65170000)
df = data.frame(location, lastUpdated, firstUpdated,latitude,longitude,pm25,pm10,no2)
The general idea is to click on a location on the map, with the markers taken from this dataframe. I also have one selectInput() and two dateInput() controls. The two dateInput() controls should take df$firstUpdated and df$lastUpdated as their values, respectively, and the selectInput() should list the pollutants that are available according to the "TRUE"/"FALSE" values in df. The plots should then be created. All of this should be triggered by clicking on the map.
So far I have not been able to achieve this. To help you understand, I connected the selectInput() and the dateInput() controls to input$loc, a selectInput() of locations in the first tab, which I will no longer need once I find the solution.
library(shiny)
library(leaflet)
library(plotly)
library(shinythemes)
library(htmltools)
library(DT)
library(utilr)
library(openair)
library(plotly)
library(dplyr)
library(ggplot2)
library(gissr)
library(ropenaq)
library(worldmet)
# Define UI for application that draws a histogram
ui = navbarPage("ROPENAQ",
  tabPanel("CREATE DATAFRAME",
    sidebarLayout(
      # Sidebar panel for inputs ----
      sidebarPanel(
        wellPanel(
          uiOutput("loc"),
          helpText("Choose a Location to create the dataframe.")
        )
      ),
      mainPanel(
      )
    )
  ),
  tabPanel("LEAFLET MAP",
    leafletOutput("map"),
    wellPanel(
      uiOutput("dt"),
      uiOutput("dt2"),
      helpText("Choose a start and end date for the dataframe creation. Select up to 2 dates")
    ),
    "Select your Pollutant",
    uiOutput("pollutant"),
    helpText("While all pollutants are listed here, not all pollutants are measured at all locations and all times.
             Results may not be available; this will be corrected in further revisions of the app. Please refer to the measurement availability
             in the 'popup' on the map."),
    hr(),
    fluidRow(column(8, plotOutput("tim")),
             column(4,plotOutput("polv"))),
    hr(),
    fluidRow(column(4, plotOutput("win")),
             column(8,plotOutput("cal"))),
    hr(),
    fluidRow(column(12, plotOutput("ser"))
    )
  )
)
#server.r
# load data
# veh_data_full <- readRDS("veh_data_full.RDS")
# veh_data_time_var_type <- readRDS("veh_data_time_var_type.RDS")
df$location <- gsub( " " , "+" , df$location)
server = function(input, output, session) {
  output$pollutant<-renderUI({
    selectInput("pollutant", label = h4("Choose Pollutant"),
                choices = colnames(df[,6:8]),
                selected = 1)
  })
  #Stores the value of the pollutant selection to pass to openAQ request
  ###################################
  #output$OALpollutant <- renderUI({OALpollutant})
  ##################################
  # create the map, using dataframe 'locations' which is polled daily (using ropenaq)
  #MOD TO CONSIDER: addd all available measurements to the popup - true/false for each pollutant, and dates of operation.
  output$map <- renderLeaflet({
    leaflet(subset(df,(df[,input$pollutant]=="TRUE")))%>% addTiles() %>%
      addMarkers(lng = subset(df,(df[,input$pollutant]=="TRUE"))$longitude, lat = subset(df,(df[,input$pollutant]=="TRUE"))$latitude,
                 popup = paste("Location:", subset(df,(df[,input$pollutant]=="TRUE"))$location, "<br>",
                               "Pollutant:", input$pollutant, "<br>",
                               "First Update:", subset(df,(df[,input$pollutant]=="TRUE"))$firstUpdated, "<br>",
                               "Last Update:", subset(df,(df[,input$pollutant]=="TRUE"))$lastUpdated
                 ))
  })
  #Process Tab
  OAL_site <- reactive({
    req(input$map_marker_click)
    location %>%
      filter(latitude == input$map_marker_click$lat,
             longitude == input$map_marker_click$lng)
    ###########
    #call Functions for data retrieval and processing. Might be best to put all data request
    #functions into a seperate single function. Need to:
    # call importNOAA() to retrieve meteorology data into temporary data frame
    # call aq_measurements() to retrieve air quality into a temporary data frame
    # merge meteorology and air quality datasets into one working dataset for computations; temporary
    # meteorology and air quality datasets to be removed.
    # call openAir() functions to create plots from merged file. Pass output to a dashboard to assemble
    # into appealing output.
    # produce output, either as direct download, or as an emailable PDF.
    # delete all temporary files and reset for next run.
  })
  #fun
  output$loc<-renderUI({
    selectInput("loc", label = h4("Choose location"),
                choices = df$location ,selected = 1
    )
  })
  output$dt<-renderUI({
    dateInput('date',
              label = 'First Available Date',
              value = subset(df$firstUpdated,(df[,1]==input$loc))
    )
  })
  output$dt2<-renderUI({
    dateInput('date2',
              label = 'Last available Date',
              value = subset(df$lastUpdated,(df[,1]==input$loc))
    )
  })
  rt<-reactive({
    AQ<- aq_measurements(location = input$loc, date_from = input$dt,date_to = input$dt2,parameter = input$pollutant)
    met <- importNOAA(year = 2014:2018)
    colnames(AQ)[9] <- "date"
    merged<-merge(AQ, met, by="date")
    # date output -- reports user-selected state & stop dates in UI
    merged$location <- gsub( " " , "+" , merged$location)
    merged
  })
  #DT
  output$tim = renderPlot({
    timeVariation(rt(), pollutant = "value")
  })
}
shinyApp(ui = ui, server = server)
The part of my code where I believe input$MAPID_click should be applied is:
output$map <- renderLeaflet({
  leaflet(subset(locations,(locations[,input$pollutant]=="TRUE")))%>% addTiles() %>%
    addMarkers(lng = subset(locations,(locations[,input$pollutant]=="TRUE"))$longitude, lat = subset(locations,(locations[,input$pollutant]=="TRUE"))$latitude,
               popup = paste("Location:", subset(locations,(locations[,input$pollutant]=="TRUE"))$location, "<br>",
                             "Pollutant:", input$pollutant, "<br>",
                             "First Update:", subset(locations,(locations[,input$pollutant]=="TRUE"))$firstUpdated, "<br>",
                             "Last Update:", subset(locations,(locations[,input$pollutant]=="TRUE"))$lastUpdated
               ))
})
output$dt<-renderUI({
  dateInput('date',
            label = 'First Available Date',
            value = subset(locations$firstUpdated,(locations[,1]==input$loc))
  )
})
output$dt2<-renderUI({
  dateInput('date2',
            label = 'Last available Date',
            value = subset(locations$lastUpdated,(locations[,1]==input$loc))
  )
})
rt<-reactive({
  AQ<- aq_measurements(location = input$loc, date_from = input$dt,date_to = input$dt2)
  met <- importNOAA(year = 2014:2018)
  colnames(AQ)[9] <- "date"
  merged<-merge(AQ, met, by="date")
  # date output -- reports user-selected state & stop dates in UI
  merged$location <- gsub( " " , "+" , merged$location)
  merged
})
#DT
output$tim = renderPlot({
  timeVariation(rt(), pollutant = "value")
})

Here is a minimal example. You click on a marker and you get a plot. It uses the sample df from the question.
library(shiny)
library(leaflet)
library(dplyr)
library(ggplot2)
ui = fluidPage(
  leafletOutput("map"),
  textOutput("temp"),
  plotOutput('tim')
)
#server.r
#df$location <- gsub( " " , "+" , df$location)
server = function(input, output, session) {
  output$map <- renderLeaflet({
    # The ~ makes leaflet look up longitude and latitude as columns of df
    leaflet(df) %>% addTiles() %>% addMarkers(lng = ~longitude, lat = ~latitude)
  })
  output$temp <- renderPrint({
    input$map_marker_click$lng
  })
  output$tim <- renderPlot({
    # Wait until a marker has been clicked before filtering
    req(input$map_marker_click)
    temp <- df %>% filter(longitude == input$map_marker_click$lng)
    # timeVariation(temp, pollutant = "value")
    print(ggplot(data = temp, aes(longitude, latitude)) + geom_point())
  })
}
shinyApp(ui = ui, server = server)

gdata.data.PhoneNumber: How do I get the type of Phone Number?

Using the class gdata.data.PhoneNumber, how do I get the type (Home/Business/Mobile/etc.) of that phone number?
This is the documentation I am referencing: https://gdata-python-client.googlecode.com/hg/pydocs/gdata.data.html#PhoneNumber

The "rel" attribute should be what you are looking for.
This is example code from https://github.com/google/gdata-python-client/blob/master/tests/gdata_tests/contacts/service_test.py:
# Create a new entry
new_entry = gdata.contacts.ContactEntry()
new_entry.title = atom.Title(text='Elizabeth Bennet')
new_entry.content = atom.Content(text='Test Notes')
new_entry.email.append(gdata.contacts.Email(
    rel='http://schemas.google.com/g/2005#work',
    primary='true',
    address='liz@gmail.com'))
new_entry.phone_number.append(gdata.contacts.PhoneNumber(
    rel='http://schemas.google.com/g/2005#work', text='(206)555-1212'))
new_entry.organization = gdata.contacts.Organization(
    org_name=gdata.contacts.OrgName(text='TestCo.'),
    rel='http://schemas.google.com/g/2005#work')
It doesn't access the "rel" attribute but it is there, I swear :)
Once you get a PhoneNumber instance you can print every attribute with the built-in dir() function:
print(dir(phone_number))
The following is a list of "rel"s (https://github.com/google/gdata-python-client/blob/master/src/gdata/data.py). I don't know whether all are applicable to phone numbers or not but it may be useful for checking the type:
FAX_REL = 'http://schemas.google.com/g/2005#fax'
HOME_REL = 'http://schemas.google.com/g/2005#home'
HOME_FAX_REL = 'http://schemas.google.com/g/2005#home_fax'
ISDN_REL = 'http://schemas.google.com/g/2005#isdn'
MAIN_REL = 'http://schemas.google.com/g/2005#main'
MOBILE_REL = 'http://schemas.google.com/g/2005#mobile'
OTHER_REL = 'http://schemas.google.com/g/2005#other'
OTHER_FAX_REL = 'http://schemas.google.com/g/2005#other_fax'
PAGER_REL = 'http://schemas.google.com/g/2005#pager'
RADIO_REL = 'http://schemas.google.com/g/2005#radio'
TELEX_REL = 'http://schemas.google.com/g/2005#telex'
TTL_TDD_REL = 'http://schemas.google.com/g/2005#tty_tdd'
WORK_REL = 'http://schemas.google.com/g/2005#work'
WORK_FAX_REL = 'http://schemas.google.com/g/2005#work_fax'
WORK_MOBILE_REL = 'http://schemas.google.com/g/2005#work_mobile'
WORK_PAGER_REL = 'http://schemas.google.com/g/2005#work_pager'
NETMEETING_REL = 'http://schemas.google.com/g/2005#netmeeting'
Those OTHER "rel"s can (or maybe should?) be joined with the object's "label" attribute.
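Since every "rel" in the list above shares the same prefix, the type can be read from the part after the "#". The sketch below is one possible way to do that. phone_type and REL_PREFIX are hypothetical names, not part of gdata, and falling back to the label for "other" is an assumption based on the remark above.

```python
# Hypothetical constant: the prefix shared by the rel values listed above.
REL_PREFIX = "http://schemas.google.com/g/2005#"


def phone_type(rel, label=None):
    """Return a short type name such as 'work' or 'mobile' for a rel URL."""
    if rel and rel.startswith(REL_PREFIX):
        kind = rel[len(REL_PREFIX):]
        # Assumption: for 'other', a free-text label is more informative.
        if kind == "other" and label:
            return label
        return kind
    # No recognizable rel: fall back to the label, if there is one.
    return label or "unknown"


print(phone_type("http://schemas.google.com/g/2005#work_mobile"))  # work_mobile
```

With a real entry this would be called with the phone number object's rel and label attributes, assuming they carry those names as described in the answer.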
