I have several maps that I am working with. I want to extract the values (1, 0 and NA) from the maps and place them all into a summary matrix. Since I have so many maps, I think it's best to do this in a for loop. This is the code I have so far; my maps and the empty summary matrix are uploaded to my Dropbox here: DATASET
setwd ('C:/Users/Israel/Dropbox/')
require (raster)
require(rgdal)
require (plyr)
#load in the empty matrix to be filled
range.summary<-read.csv('range_sizes.csv', header=T)
#load in maps and count pixels
G1.total<-raster('Group1/Summary/PA_current_G1.tif')
G1.total.df<-as.data.frame(G1.total)
#these are the values I need to be placed into the empty matrix (range.summary)
count (G1.total.df)
  PA_current_G1   freq
1             0 227193
2             1 136871
3            NA 561188
Try this. I downloaded 3 of the images to test with:
library(raster)
wd <- 'D:\\Programacao\\R\\Stackoverflow\\raster'
allfiles <- list.files(file.path(wd), all.files = FALSE)
# List the TIF files in the wd folder (the dot is escaped so only a literal ".tif" suffix matches)
tifs <- grep("\\.tif$", allfiles, ignore.case = TRUE, value = TRUE)
#stack rasterLayer
mystack <- stack(file.path(wd, tifs))
# calculate frequencies
freqs <- freq(mystack, useNA='ifany')
# rbind list to get a data.frame
freqsdf <- do.call(rbind.data.frame, freqs)
freqsdf
                    value  count
PA_2050_26_G1.1         0 256157
PA_2050_26_G1.2         1 193942
PA_2050_26_G1.3        NA 475153
PA_2050_26_G2.1         0 350928
PA_2050_26_G2.2         1  99171
PA_2050_26_G2.3        NA 475153
PA_2050_26_sub_G1.1     0 112528
PA_2050_26_sub_G1.2     1  90800
PA_2050_26_sub_G1.3    NA 721924
str(freqsdf)
'data.frame': 9 obs. of 2 variables:
$ value: num 0 1 NA 0 1 NA 0 1 NA
$ count: num 256157 193942 475153 350928 99171 ...
Now it is just a matter of reshaping the output into the layout of your summary matrix.
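As a rough illustration of that last step, here is a minimal sketch (in Python rather than R, purely to show the logic) that turns the long value/count listing into one row per map. The rows are copied from the output above; the helper name `pivot_counts` and the use of `None` for NA are assumptions made for this example.

```python
from collections import defaultdict

# (layer, value, count) rows copied from the freq() output above;
# None stands in for NA.
rows = [
    ("PA_2050_26_G1", 0, 256157),
    ("PA_2050_26_G1", 1, 193942),
    ("PA_2050_26_G1", None, 475153),
    ("PA_2050_26_G2", 0, 350928),
    ("PA_2050_26_G2", 1, 99171),
    ("PA_2050_26_G2", None, 475153),
    ("PA_2050_26_sub_G1", 0, 112528),
    ("PA_2050_26_sub_G1", 1, 90800),
    ("PA_2050_26_sub_G1", None, 721924),
]


def pivot_counts(rows):
    """Turn long (layer, value, count) rows into {layer: {value: count}}."""
    summary = defaultdict(dict)
    for layer, value, count in rows:
        key = "NA" if value is None else str(value)
        summary[layer][key] = count
    return dict(summary)


summary = pivot_counts(rows)
# One line per map: name, count of 0s, count of 1s, count of NAs.
for layer, counts in summary.items():
    print(layer, counts["0"], counts["1"], counts["NA"])
```

In R, the same reshaping could be done on freqsdf with whichever reshaping tool you prefer; the sketch only shows the target layout.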
I am a newbie, so thank you very much in advance for any advice.
I want to make a co-occurrence matrix, and I followed the link below:
How to use R to create a word co-occurrence matrix
But I cannot understand why the value of A-A is 10 in the matrix below.
Shouldn't it be 4, since there are four As?
dat <- read.table(text='film tag1 tag2 tag3
1 A A A
2 A C F
3 B D C ', header=T)
crossprod(as.matrix(mtabulate(as.data.frame(t(dat[, -1])))))
   A C F B D
A 10 1 1 0 0
C  1 2 1 1 1
F  1 1 1 0 0
B  0 1 0 1 1
D  0 1 0 1 1
The solution you used presumes that each tag appears only once per film, which jibes with the definition of a co-occurrence matrix as far as I can tell. Because film 1 lists A three times, each A on that line gets counted as co-occurring with itself and with the other two As (3 × 3 = 9); adding the single A on the second line (1 × 1 = 1) gives a total of ten co-occurrences.
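To make the arithmetic concrete, here is a small sketch (Python, standard library only) that reproduces the cross-product on the per-film tag counts and then repeats it with each tag counted at most once per film. The helper `crossprod` is written for this example to mimic R's function of the same name, and clamping the counts to 0/1 is one possible way to enforce the "once per film" assumption, not necessarily what the linked answer intended.

```python
tags = ["A", "C", "F", "B", "D"]  # column order of the matrix above
films = [["A", "A", "A"], ["A", "C", "F"], ["B", "D", "C"]]


def crossprod(m):
    """Return t(m) %*% m for a matrix given as a list of rows."""
    ncol = len(m[0])
    return [[sum(row[i] * row[j] for row in m) for j in range(ncol)]
            for i in range(ncol)]


# How many times each tag appears in each film.
counts = [[film.count(t) for t in tags] for film in films]
# Same table, but a tag counts at most once per film.
presence = [[min(c, 1) for c in row] for row in counts]

raw = crossprod(counts)
binary = crossprod(presence)
print(raw[0][0])     # A-A from raw counts: 3*3 + 1*1 = 10
print(binary[0][0])  # A-A with presence/absence: 1*1 + 1*1 = 2
```

With presence/absence, the diagonal is the number of films containing the tag (2 for A), which is still not 4; the total number of As is simply the column sum of the count table.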
My data.txt file contains the 2D coordinates of points forming a segment of a polygon. These coordinates are evolving over time. The file is structured like this:
itr nbr_pts p1.x p1.y ...... pk.x pk.y
(itr+1) ..........
.....
where pk is the k-th point/vertex of the polygon and nbr_pts is the number of vertices.
My question is how to draw the 2D polygon from its vertices (p1, p2, ...pk) at a certain iteration (row)?
In addition, note that there is not only one data file/polygon but N ones: data1.txt .... dataN.txt
I tried something like this, but it did not work (number of files = 6):
N = 6
set multiplot
plot for [i=0:N-1] polygon_i = sprintf("%s/data%d.dat",filename, i) polygon_i val=$2 for [j=1:$2] u (j+1):(j+1+1) w lines
I know how many polygons/files there are (6 in this case), but I have no prior knowledge of the number of columns in each file; the number of vertices can vary from one polygon to another.
Any ideas, please?
The idea I have would need a modification in the structure of your files. For each iteration time, there is a block containing the x and y coordinates of the polygon's vertices:
# file: data1.txt
# itr 0
0 0
1 1
1 2
0 0


# itr 1
1 3
2 1
0 1
1 2
1 3


# itr 2
3 1
2 1
0 0
3 1
Notice that consecutive blocks are separated by two empty lines. For iteration 0 (block 0, or itr 0) there is a polygon with three vertices, itr 1 has four vertices, and itr 2 has three vertices. To obtain a closed curve, the first point has to be repeated at the end of the block; for example, for itr 1 the point 1 3 appears twice.
For this file, we can plot the polygon at iteration iter as
iter = 1 # select block 1, or itr 1
plot "data1.txt" index iter w lp ps 2 pt 7
If you have several files, then try
# option 1
nbr = 6 # number of files
iter = 1 # select block 1, or itr 1
plot for [i=1:nbr] "data".i.".txt" index iter w lp ps 2 pt 7 title "".i
# option 2
files = system("ls data*.txt") # get all datafiles in folder
iter = 1 # select block 1, or itr 1
plot for [data in files] data index iter w lp ps 2 pt 7 title data
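If the files already exist in the one-row-per-iteration layout from the question, they could be converted to the block layout with a short script. The following is a minimal Python sketch under the assumption that every row is whitespace-separated as `itr nbr_pts p1.x p1.y ... pk.x pk.y`; the function name `rows_to_blocks` and the sample rows are made up for illustration.

```python
def rows_to_blocks(text):
    """Convert 'itr nbr_pts x1 y1 ... xk yk' rows into gnuplot index blocks."""
    blocks = []
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue  # skip blank lines and comments
        itr, npts = fields[0], int(fields[1])
        coords = fields[2:2 + 2 * npts]
        points = [(coords[i], coords[i + 1]) for i in range(0, len(coords), 2)]
        points.append(points[0])  # repeat the first vertex to close the polygon
        body = "\n".join(f"{x} {y}" for x, y in points)
        blocks.append(f"# itr {itr}\n{body}")
    # Two blank lines between blocks, so that `index` can address each one.
    return "\n\n\n".join(blocks) + "\n"


# Hypothetical input: a triangle at iteration 0, a quadrilateral at iteration 1.
sample = "0 3 0 0 1 1 1 2\n1 4 1 3 2 1 0 1 1 2\n"
converted = rows_to_blocks(sample)
print(converted, end="")
```

Because the number of vertices is read from the second column of each row, the script does not need to know the column count in advance.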
I have the following array:
AA = zeros(5,3);
AA(1,3)=1;
AA(3,3)=1;
AA(4,2)=1;
and I want to place the value one in the columns defined by the following
vector a = [0; 2; 0; 0; 1]. Each value of this vector refers to the column
index that we want to change in the corresponding row. When a zero appears, no change should be made.
Desired output:
0 0 1
0 1 0
0 0 1
0 1 0
1 0 0
Could you please suggest a way to do this without a for loop? The goal is
faster execution.
Thanks!!!
Approach 1
nrows = size(AA,1) %// Get the no. of rows, as we would use this parameter later on
%// Calculate the linear indices with `a` as the column indices and
%// [1:nrows] as the row indices
idx = (a-1)*nrows+[1:nrows]' %//'
%// Select the valid linear indices (the ones whose corresponding entry in `a` is non-zero)
%// and use them to index into AA and set those elements to 1
AA(idx(a~=0))=1
Code output with given AA -
>> AA
AA =
0 0 1
0 1 0
0 0 1
0 1 0
1 0 0
Approach 2
AA(sub2ind(size(AA),find(a~=0),a(a~=0)))=1
Breaking it down to few steps for explanation:
find(a~=0) and a(a~=0) give us the valid row and column indices respectively, as needed for the sub2ind(size(),row,column) format.
sub2ind gets us the linear indices, which we can use to index into input matrix AA and set those in AA as 1's.
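Both approaches rest on the same column-major formula, linear index = (column - 1) * nrows + row. The following Python sketch (standard library only) walks through that arithmetic for the AA and a given in the question. It keeps MATLAB's 1-based row and column numbers and converts to Python offsets only at the point of assignment; it uses explicit loops for clarity, so it illustrates the index computation rather than a vectorized implementation.

```python
nrows, ncols = 5, 3
AA = [[0, 0, 1],
      [0, 0, 0],
      [0, 0, 1],
      [0, 1, 0],
      [0, 0, 0]]
a = [0, 2, 0, 0, 1]  # 1-based column to set in each row; 0 means "leave alone"

# Flatten AA column by column, which is how MATLAB stores a matrix.
flat = [AA[r][c] for c in range(ncols) for r in range(nrows)]

# Same formula as Approach 1, keeping only rows where a is non-zero.
idx = [(col - 1) * nrows + row
       for row, col in enumerate(a, start=1) if col != 0]
for k in idx:
    flat[k - 1] = 1  # convert the 1-based linear index to a Python offset

# Rebuild the row-wise view to compare with the desired output.
result = [[flat[c * nrows + r] for c in range(ncols)] for r in range(nrows)]
print(idx)
for row in result:
    print(row)
```

The two linear indices, 7 and 5, correspond to AA(2,2) and AA(5,1).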
Suppose you have this program
A subroutine readDIM reads the dimensions (rows, columns) of a matrix from a txt file. (To simplify, let the entries be INTEGER.) readDIM works using tokens, and we assume it works fine.
A text file containing for example:
1 2 3 4
1 2 20 5
3 0 333 3
Returns nrow = 3, ncol = 4
Since readDIM has given the true dimensions of the matrix, I want to allocate space to:
REAL, DIMENSION (:,:), ALLOCATABLE :: vMatrix
To read the matrix from a txt file and to store it into the 2d-array. So I've written the following
SUBROUTINE buildVMatrix
OPEN(UNIT=1, FILE = filename, STATUS ='OLD',IOSTAT=ios);
ALLOCATE(vMatrix(nrow,ncol));
WRITE(*,*) "Register matrix from file:", filename;
WRITE(*,*) "-------------------------------------------------------";
DO i = 1, UBOUND(vMatrix,1)
READ(1,*, IOSTAT = ios) (vMatrix(i,j),j=1,UBOUND(vMatrix,2));
!IF(ios /= 0 ) EXIT
END DO
CLOSE(1)
END SUBROUTINE
When I print vMatrix the output is:
matrix.txt:      buildVMatrix output (once printed):
1 2 3 4          1 2 3 4
1 2 20 5         1 2 20 5
3 0 333 3        3 0 333 0
It doesn't read the last number. I know it's caused by the DO loop inside buildVMatrix, but I can't explain why and have no idea how to fix it by writing the code differently.
That's because the last line of your txt file has no line ending. Try typing a return after the last number.
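If editing the file by hand is not practical, the check can be scripted. Below is a minimal Python sketch that appends a line ending only when the file lacks one; the function name `ensure_trailing_newline` is made up for this example, and the demo writes a temporary copy of the matrix from the question rather than touching any real file.

```python
import os
import tempfile


def ensure_trailing_newline(path):
    """Append a newline if the file is non-empty and does not end with one.

    Returns True when the file was modified.
    """
    with open(path, "rb+") as fh:
        fh.seek(0, os.SEEK_END)
        if fh.tell() == 0:
            return False  # empty file: nothing to fix
        fh.seek(-1, os.SEEK_END)
        if fh.read(1) == b"\n":
            return False  # already terminated
        fh.seek(0, os.SEEK_END)
        fh.write(b"\n")
        return True


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "matrix.txt")
    with open(path, "wb") as fh:
        fh.write(b"1 2 3 4\n1 2 20 5\n3 0 333 3")  # no final line ending
    changed = ensure_trailing_newline(path)
    changed_again = ensure_trailing_newline(path)  # second call is a no-op
    with open(path, "rb") as fh:
        fixed = fh.read()

print(changed, changed_again)
```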
Can you make this R code faster? Can't see how to vectorize it.
I have a data-frame as follows (sample rows below):
> str(tt)
'data.frame': 1008142 obs. of 4 variables:
$ customer_id: int, visit_date : Date, format: "2010-04-04", ...
I want to compute the diff between visit_dates for a customer.
So I do diff(tt$visit_date), but have to enforce a discontinuity (NA) everywhere customer_id changes and the diff is meaningless, e.g. row 74 below.
The code at bottom does this, but takes >15 min on the 1M row dataset.
I also tried piecewise computing and cbind'ing the subresult per customer_id (using which()), that was also slow.
Any suggestions? Thanks. I did search SO, R-intro, R manpages, etc.
customer_id visit_date visit_spend ivi
72 40 2011-03-15 18.38 5
73 40 2011-03-20 23.45 5
74 79 2010-04-07 150.87 NA
75 79 2010-04-17 101.90 10
76 79 2010-05-02 111.90 15
Code:
all_tt_cids <- unique(tt$customer_id)
# Append ivi (Intervisit interval) column
tt$ivi <- c(NA,diff(tt$visit_date))
for (cid in all_tt_cids) {
# ivi has a discontinuity when customer_id changes
tt$ivi[min(which(tt$customer_id==cid))] <- NA
}
(Wondering if we can create a logical index where customer_id differs to the row above?)
To set NA in the appropriate places, you can again use diff() with a one-line trick:
> tt$ivi[c(1,diff(tt$customer_id)) != 0] <- NA
Explanation:
Let's take some vector x
x <- c(1,1,1,1,2,2,2,4,4,4,5,3,3,3)
We want to extract the indexes at which a new number starts, i.e. (1,5,8,11,12). We can use diff() for that.
y <- c(1,diff(x))
# y = 1 0 0 0 1 0 0 2 0 0 1 -2 0 0
and take the indexes where y is not equal to zero:
x[y!=0] <- NA
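The same idea, comparing each element with the one before it, can be sketched outside R as well. The following Python example (standard library only) finds the 1-based start positions of each run in x and then applies the rule to the five sample rows from the question. The function name `intervisit_days` is made up for this example, and `None` plays the role of NA.

```python
from datetime import date

x = [1, 1, 1, 1, 2, 2, 2, 4, 4, 4, 5, 3, 3, 3]
# 1-based positions where the value differs from the previous element
# (the first element always starts a run).
starts = [i + 1 for i in range(len(x)) if i == 0 or x[i] != x[i - 1]]
print(starts)


def intervisit_days(customer_ids, visit_dates):
    """Days since the previous row's visit, or None on each customer's first row."""
    ivi = []
    for i, (cid, day) in enumerate(zip(customer_ids, visit_dates)):
        if i == 0 or cid != customer_ids[i - 1]:
            ivi.append(None)  # discontinuity: customer_id changed
        else:
            ivi.append((day - visit_dates[i - 1]).days)
    return ivi


# Rows 72-76 from the question, treated here as a standalone table.
customer_ids = [40, 40, 79, 79, 79]
visit_dates = [date(2011, 3, 15), date(2011, 3, 20),
               date(2010, 4, 7), date(2010, 4, 17), date(2010, 5, 2)]
ivi = intervisit_days(customer_ids, visit_dates)
print(ivi)
```

Like the diff() one-liner, this assumes the rows are already sorted so that each customer's visits are contiguous and in date order.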