Import an *.xls file in R? - windows

I am struggling to read an *.xls file into R.
I did the following:
I set my working directory to the folder containing the *.xls file and then ran:
> library(gdata) # load the gdata package
> mydata = read.xls("comprice.xls", sheet=1, verbose=FALSE)
Error in findPerl(verbose = verbose) : perl executable not found. Use perl= argument to specify the correct path.
Error in file.exists(tfn) : invalid 'file' argument
However, my path is correct and the file is there! What's wrong?
UPDATE
I have installed it already; however, now I get: Error: could not find function "read.xls"...

This error message means that Perl is either not installed on your computer or not on your PATH.
If Perl is installed, you can pass its location through the perl= argument of read.xls():
read.xls(xlsfile, perl="C:/perl/bin/perl.exe")
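Before passing perl= by hand, it can help to confirm whether a perl executable is visible on the PATH at all. The following is a minimal sketch in Python rather than R; the function name find_perl is hypothetical, and the fallback location is simply the example path from the call above, not a known install location.

```python
import os
import shutil

def find_perl(fallback=r"C:/perl/bin/perl.exe"):
    """Return a usable perl path, or None if none is found."""
    # shutil.which searches the directories listed in PATH
    on_path = shutil.which("perl")
    if on_path:
        return on_path
    # otherwise try an explicit location (an assumed install path)
    return fallback if os.path.isfile(fallback) else None

print(find_perl())
```

If this prints None, Perl is probably not installed; if it prints a path, that path is a reasonable value to try for the perl= argument.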

As an alternative, you could try the xlsx package:
read.xlsx("comprice.xls", 1) reads your file and makes the data.frame column classes nearly useful, but is very slow for large data sets.
read.xlsx2("comprice.xls", 1) is faster, but you'll have to define column classes manually. If you run the command twice, you will not need to count columns so much:
data <- read.xlsx2("comprice.xls", 1)
data <- read.xlsx2("comprice.xls", 1, colClasses= rep("numeric", ncol(data)))

Perl is either not installed or cannot be found. You can either install it, or specify the path where it is installed using
perl='path of perl installation'
in the call.


Unable to load/require file from Lua running from Atom in Windows

I'm trying to use Atom to run a Lua script. However, when I try to load files via the require() command, it always says it's unable to locate them. The files are all in the same folder. For example, to load utils.lua I have tried
require 'utils'
require 'utils.lua'
require 'D:\Users\Mike\Dropbox\Lua Modeling\utils.lua'
require 'D:\\Users\\Mike\\Dropbox\\Lua Modeling\\utils.lua'
require 'D:/Users/Mike/Dropbox/Lua Modeling/utils.lua'
I get errors like
Lua: D:\Users\Mike\Dropbox\Lua Modeling\main.lua:12: module 'D:\Users\Mike\Dropbox\Lua Modeling\utils.lua' not found:
no field package.preload['D:\Users\Mike\Dropbox\Lua Modeling\utils.lua']
no file '.\D:\Users\Mike\Dropbox\Lua Modeling\utils\lua.lua'
no file 'D:\Program Files (x86)\Lua\5.1\lua\D:\Users\Mike\Dropbox\Lua Modeling\utils\lua.lua'
no file 'D:\Program Files (x86)\Lua\5.1\lua\D:\Users\Mike\Dropbox\Lua Modeling\utils\lua\init.lua'
no file 'D:\Program Files (x86)\Lua\5.1\D:\Users\Mike\Dropbox\Lua Modeling\utils\lua.lua'
The message says on the first line that 'D:\Users\Mike\Dropbox\Lua Modeling\utils.lua' was not found, even though that is the full path of the file. What am I doing wrong?
Thanks.
The short answer
You should be able to load utils.lua by using the following code:
require("utils")
And by starting your program from the directory that utils.lua is in:
cd "D:\Users\Mike\Dropbox\Lua Modeling"
lua main.lua
The long answer
To understand what is going wrong here, it is helpful to know a little bit about how require works. The first thing that require does is to search for the module in the module path. From Programming in Lua chapter 8.1:
The path used by require is a little different from typical paths. Most programs use paths as a list of directories wherein to search for a given file. However, ANSI C (the abstract platform where Lua runs) does not have the concept of directories. Therefore, the path used by require is a list of patterns, each of them specifying an alternative way to transform a virtual file name (the argument to require) into a real file name. More specifically, each component in the path is a file name containing optional interrogation marks. For each component, require replaces each ? by the virtual file name and checks whether there is a file with that name; if not, it goes to the next component. The components in a path are separated by semicolons (a character seldom used for file names in most operating systems). For instance, if the path is
?;?.lua;c:\windows\?;/usr/local/lua/?/?.lua
then the call require"lili" will try to open the following files:
lili
lili.lua
c:\windows\lili
/usr/local/lua/lili/lili.lua
Judging from your error message, your Lua path seems to be the following:
.\?.lua;D:\Program Files (x86)\Lua\5.1\lua\?.lua;D:\Program Files (x86)\Lua\5.1\lua\?\init.lua;D:\Program Files (x86)\Lua\5.1\?.lua
To make that easier to read, here are the patterns, one per line:
.\?.lua
D:\Program Files (x86)\Lua\5.1\lua\?.lua
D:\Program Files (x86)\Lua\5.1\lua\?\init.lua
D:\Program Files (x86)\Lua\5.1\?.lua
From this list you can see that when you call require:
Lua fills in the .lua extension for you
Lua fills in the rest of the file path for you
In other words, you should just specify the module name, like this:
require("utils")
Now, Lua also needs to know where the utils.lua file is. The easiest way is to run your program from the D:\Users\Mike\Dropbox\Lua Modeling folder. This means that when you run require("utils"), Lua will expand the first pattern .\?.lua into .\utils.lua, and when it checks that path it will find the utils.lua file in the current directory.
In other words, running your program like this should work:
cd "D:\Users\Mike\Dropbox\Lua Modeling"
lua main.lua
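The pattern expansion described above can be sketched in a few lines of Python. This is only an illustration of the substitution rule, not Lua's actual implementation; the function name candidate_files is hypothetical, and the path is shortened to the first two patterns from the list above.

```python
def candidate_files(module_name, path, dir_sep="\\"):
    """Expand each pattern of a Lua-style search path for one module name."""
    # require treats dots in the module name as directory separators
    file_name = module_name.replace(".", dir_sep)
    # patterns are separated by semicolons; every ? becomes the file name
    return [p.replace("?", file_name) for p in path.split(";") if p]

lua_path = r".\?.lua;D:\Program Files (x86)\Lua\5.1\lua\?.lua"

for name in ("utils", "utils.lua"):
    print(name, "->", candidate_files(name, lua_path))
```

The second call shows why require 'utils.lua' fails: the dot is turned into a directory separator, so Lua looks for .\utils\lua.lua, which matches the "no file" lines in the error message.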
An alternative
If you can't (or don't want to) change your working directory to run the program, you can use the LUA_PATH environment variable to add new patterns to the path that require uses to search for modules.
set LUA_PATH=D:\Users\Mike\Dropbox\Lua Modeling\?.lua;%LUA_PATH%;
lua "D:\Users\Mike\Dropbox\Lua Modeling\main.lua"
There is a slight trick to this. If the LUA_PATH environment variable already exists, then this will add your project's folder to the start of it. If LUA_PATH doesn't exist, this will add ;; to the end, which Lua fills in with the default path.

Perl code doesn't run in a bash script with scheduling of crontab

I want to schedule my Perl code to run every day at a specific time, so I put the code below in a shell script:
Automate.sh
#!/bin/sh
perl /tmp/Taps/perl.pl
The schedule is specified in the crontab as follows:
10 17 * * * sh /tmp/Taps/Automate.sh > /tmp/Taps/result.log
At 17:10 the .sh file did not run. However, when I run ./Automate.sh manually, it runs and I see the result. I don't know what the problem is.
Perl Code
#!/usr/bin/perl -w
use strict;
use warnings;
use Data::Dumper;
use XML::Dumper;
use TAP3::Tap3edit;
$Data::Dumper::Indent=1;
$Data::Dumper::Useqq=1;
my $dump = new XML::Dumper;
use File::Basename;
my $perl='';
my $xml='';
my $tap3 = TAP3::Tap3edit->new();
foreach my $file(glob '/tmp/Taps/X*')
{
$files= basename($file);
$tap3->decode($files) || die $tap3->error;
}
my $filename=$files.".xml\n";
$perl = $tap3->structure;
$dump->pl2xml($perl, $filename);
print "Done \n";
error:
No such file or directory for file X94 at /tmp/Taps/perl.pl line 22.
X94.xml
foreach my $file(glob 'Taps/X*') -- when you're running from cron, your current directory is /. You'll want to provide the full path to that Taps directory. Also specify the output directory for Out.xml
Cron uses a minimal environment and a short $PATH, which may not necessarily include the expected path to perl. Try specifying this path fully. Or source your shell settings before running the script.
There are a lot of things that can go wrong here. The most obvious and certain one is that if you use a glob to find the file in directory "Taps", then remove the directory from the file name by using basename, then Perl cannot find the file. Not quite sure what you are trying to achieve there. The file names from the glob will be for example Taps/Xfoo, a relative path to the working directory. If you try to access Xfoo from the working directory, that file will not be found (or the wrong file will be found).
This should also (probably) lead to a fatal error, which should be reported in your error log. (Assuming that the decode function returns a false value upon error, which is not certain.) If no errors are reported in your error log, that is a sign the program does not run at all. Or it could be that decode does not return false on missing file, and the file is considered to be empty.
I assume that when you test the program, you cd to /tmp and run it, or your "Taps" directory is in your home directory. So you are making assumptions about where your program looks for the files. You should be certain where it looks for files, probably by using only absolute paths.
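The effect of stripping the directory and then relying on the working directory can be sketched in Python. This is an illustration only; the helper name resolve and the working directory /home/user are hypothetical stand-ins for wherever the job happens to start.

```python
import posixpath

def resolve(name, cwd):
    """Resolve a possibly relative file name against a working directory."""
    return posixpath.normpath(posixpath.join(cwd, name))

found = "/tmp/Taps/X94"                # a path as returned by the glob
stripped = posixpath.basename(found)   # "X94": the directory part is gone

print(resolve(stripped, "/tmp/Taps"))   # /tmp/Taps/X94 (works by accident)
print(resolve(stripped, "/home/user"))  # /home/user/X94 (file not found)
print(resolve(found, "/home/user"))     # /tmp/Taps/X94 (absolute path is safe)
```

Only the absolute path gives the same result regardless of where the script is started, which is why it behaves differently under cron than when run by hand from the data directory.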
Another simple error might be that crontab does not have permission to execute the file, or no read access to "Taps".
Edit:
Other complications in your code:
You include Data::Dumper, but never actually use that module.
$xml variable is not used.
$files variable not declared (this code would never run with use strict)
The code that uses $files to build the output file name is outside your foreach loop, which means it will only run once. Since you use glob I assumed you were reading more than one file, in which case this solution will probably not do what you want. It is also possible that you are using a glob because the file name can change, e.g. X93, X94, etc. In that case you will only use the last file name returned by the glob. But this looks like a weak link in your logic.
You add a newline \n to a file name, which is strange.

sql loader without .dat extension

Oracle's sqlldr defaults to a .dat extension, which I want to override. I would rather not rename the file. When I googled this, I found a few answers suggesting a trailing dot, like data='fileName.', but that does not work. Please share your ideas.
The error message is that fileName.dat is not found.
SQL*Loader has a default extension for each of its files (data, log, control, ...):
data = .dat
log = .log
control = .ctl
bad = .bad
parfile = .par
But you have to pass the file name without the apostrophes and the dot:
sqlldr user/pass@db control=control data=data
SQL*Loader will add the extensions: control.ctl and data.dat.
Nevertheless, I do not understand why you do not want to specify the extension.
You can't, at least in Unix/Linux environments. In Windows you can use the trailing period trick, specifying either INFILE 'filename.' in the control file or DATA=filename. on the command line. Windows file name handling allows that; you can for instance do DIR filename. at a command prompt and it will list the file with no extension (as will DIR filename). But you can't do that with *nix, from a shell prompt or anywhere else.
You said you don't want to copy or rename the file. Temporarily renaming it might be the simplest solution, but as you may have a reason not to do that even briefly you could instead create a hard or soft link to the file which does have an extension, and use that link as the target instead. You could wrap that in a shell script that takes the file name argument:
# set variable from correct positional parameter; if you pass in the control
# file name or other options, this might not be $1 so adjust as needed
# if the temporary file won't be in the same directory, this needs to be the full path
filename=$1
# optionally check file exists, is readable, etc. but overkill for demo
# can also check temporary file does not already exist - stop or remove
# create soft link somewhere it won't impact any other processes
ln -s "${filename}" "/tmp/${filename##*/}.dat"
# run SQL*Loader with soft link as target
sqlldr user/password@db control=file.ctl data="/tmp/${filename##*/}.dat"
# clean up
rm -f "/tmp/${filename##*/}.dat"
You can then call that as:
./scriptfile.sh /path/to/filename
If you can create the link in the same directory then you only need to pass the file name, but if it's somewhere else (which may be necessary depending on why renaming isn't an option, and is desirable either way) then you need to pass the full path of the data file so the link works. (If the temporary file will be in the same filesystem you could use a hard link, and you wouldn't have to pass the full path then either, but it's still cleaner to do so.)
As you haven't shown your current command line options you may have to adjust that to take into account anything else you currently specify there rather than in the control file, particularly which positional argument is actually the data file path.
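The same link-then-clean-up idea can also be sketched in Python, which makes the cleanup happen even if the load fails. This is a minimal sketch under assumptions: the helper name dat_link is hypothetical, the link is placed in a fresh temporary directory rather than directly in /tmp, and it relies on symbolic links being available (as on Unix/Linux).

```python
import contextlib
import os
import tempfile

@contextlib.contextmanager
def dat_link(data_file):
    """Yield the path of a temporary .dat symlink pointing at data_file."""
    # a private temporary directory avoids clashing with other processes
    link_dir = tempfile.mkdtemp()
    link_path = os.path.join(link_dir, os.path.basename(data_file) + ".dat")
    # link to the absolute path so the link works from any directory
    os.symlink(os.path.abspath(data_file), link_path)
    try:
        yield link_path
    finally:
        # clean up the link and its directory; the original file is untouched
        os.remove(link_path)
        os.rmdir(link_dir)
```

Inside a block such as with dat_link("/path/to/filename") as link: the sqlldr command from the script above could be run through subprocess with data= set to link, using whatever control file and credentials you normally pass.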
I have the same issue. I get a monthly download of reference data used in a medical application, and the 485 downloaded files don't have file extensions (#2gb). Unless I can load them without file extensions, I have to copy the files with a .dat extension and load from there.

How to use im2rec in MXnet to create my own dataset

On Windows 10, I followed the step-by-step MXNet tutorial to use im2rec.py to create a dataset. I created an image list file like this:
integer_image_index \t label_index \t path_to_image
Next, I changed the file extension from .txt to .lst.
Finally, I executed the command:
python im2rec.py --exts '.jpg' --train-ratio 0.41 --test-ratio 0.49 --recursive=True --pack-label=True D:\CUB_200_2011\data\image_label.lst D:\CUB_200_2011\CUB_200_2011\image
It prints "read no error", but the files created by the command, such as the .lst and .rec files, are 0 KB; they are empty. I don't know why.
Please tell me what mistake I made.
im2rec.py will print
read none error:(filename)
for any file that it can't load for whatever reason. Maybe some of the files you list aren't there or are empty? Or maybe the base path you've specified is wrong -- I notice you have the folder name CUB_200_2011 twice.
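One way to test that guess is to check every path in the list file against the base folder before running im2rec.py. The sketch below assumes the tab-separated layout shown in the question (index, label, path) with the image path in the last column; the function name find_unreadable is hypothetical and is not part of MXNet.

```python
import os

def find_unreadable(lst_path, root):
    """Return image paths from a .lst file that are missing or empty."""
    problems = []
    with open(lst_path) as lst:
        for line in lst:
            parts = line.rstrip("\n").split("\t")
            if len(parts) < 3:
                continue  # skip blank or malformed lines
            # the image path is relative to the base folder given to im2rec.py
            full_path = os.path.join(root, parts[-1])
            if not os.path.isfile(full_path) or os.path.getsize(full_path) == 0:
                problems.append(full_path)
    return problems
```

If this returns most or all of the listed paths, the base folder is the likely culprit rather than the individual images.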

Reinstalling packages from a list generated by command: ado dir

I am recovering Stata following a Windows upgrade. I have a list of my packages generated from ado dir in the following format:
[1] package mdesc from http://fmwww.bc.edu/RePEc/bocode/m
'MDESC': module to tabulate prevalence of missing values
[2] package univar from http://fmwww.bc.edu/RePEc/bocode/u
'UNIVAR': module to generate univariate summary with box-and-whiskers plot
[3] package tabmiss from http://www.ats.ucla.edu/stat/stata/ado/analysis
tabmiss. Shows tabulation of number of missing and non-missing values
I have many packages and would like to reinstall them without having to designate each directory/url via net cd. While using net cd along with net install or ssc install along with package names in a loop is trivial (as below), it would seem that an automated method for this task might be available.
net cd http://www.ats.ucla.edu/stat/stata/ado/analysis
local ucla tabmiss csgof powerlog ldfbeta
foreach x of local ucla {
net install `x'
}
To my knowledge, there is no built-in or automated method of tracking and managing your installed packages outside of what is available through ado or net.
I would also tend to agree with @Nick Cox that this task seems strange and I can't imagine how a new Stata install or reinstall could know what was installed previously, but I find the question interesting for other reasons.
The main reason being for users who have Stata installed on multiple machines who need the same packages on both machines. I faced a similar issue when I purchased a new computer and installed Stata but wanted all of the packages I use to be available as well. Outside of moving the ado directory or selected contents I'm not aware of any quick solution.
Here it would be possible to use the output of ado dir on one machine to determine what you need to install on a second machine with a new Stata install.
The method you propose using a foreach loop could save you time from having to type in or copy/paste a lot of packages and URLs. At the same time however, this is only beneficial if you have many packages from only a few repositories because you will need to net cd to the URL each time as you show in your example.
An alternative is a programmatic solution. As you know, ado dir will list each installed package, the URL and a short description of the package. Using this, a log file, and the built-in I/O functionality, a short program could be written to automate the process and dynamically build a do file that contains the commands to install the already installed packages.
The code below generates a do file containing commands (in this case, net describe package, from(url)) for each package I have installed on my computer.
clear *
tempfile log1
log using "`log1'", text name(mylog)
ado dir
log close mylog
tempname logfile
file open `logfile' using "`log1'", read
file read `logfile' line
file open dfh using "path/to/your/dofile.do", write replace
local pckage "package"
while r(eof) == 0 {
if `: list pckage in line' {
local packageName : word 3 of `line'
local dirName : word 5 of `line'
di "`packageName' `dirName'"
file write dfh "net describe `packageName', from(`dirName')"
file write dfh _newline
}
file read `logfile' line
}
file close `logfile'
file close dfh
In the above code, I create a temp file to write a .txt log file to and store the contents of ado dir in that file.
Then, I open the log file using file open and read it line by line in the while loop.
Above the loop, I'm creating a do file at /path/to/your/dofile.do to hold the output of the loop - the dynamically created commands relating to the installed packages on my machine.
The loop will iterate so long as r(eof) = 0, where r(eof) is an end of file marker. I use an if statement to sort out lines of the log file which contain the word package, as I'm only interested in those lines with the package name and URL in them.
Inside of the if block, I parse the local macro line to pull the package name and the URL/directory name.
This is important: this section of code assumes that the 3rd and 5th words in the macro will always be the package name and URL respectively. Confirm this from the output of ado dir before executing.
You will also need to change the command that is being written to the file handle dfh inside of the loop to what you want (net install, etc) when you are ready to execute.
For more help on using file, locals, and tempfiles execute any of the following in Stata:
help file
help extended_fcn
help macrolists
There may be nicer ways to parse the contents of ado dir but this has worked for me. And of course I'd always advise that you take the time to understand what the code is doing so that you can make any necessary tweaks to fit your particular situation.
