Error creating LMDB database file in MATLAB for Caffe

I am trying to convert a dataset of images to LMDB format for use with Caffe, and I need to call the convert_imageset tool to apply this conversion from inside Matlab.
I am using Linux, and I have created a shell (.sh) script with the parameters needed to run the conversion. Here is what my shell script looks like:
GLOG_logtostderr=1 /usr/local/caffe-master2/build/tools/convert_imageset -resize_height=256 -resize_width=256 images_folder data_split/train.txt data_split/dataCNN_train_lmdb
When I simply run my script from the terminal like this:
./example_shell.sh
it works without any problem.
But when I try to do it from Matlab using the system() function:
system('./example_shell.sh')
it seems it is not able to open or find my files, raising the following error for each image in train.txt:
I0917 18:15:13.637830 8605 convert_imageset.cpp:82] A total of 68175 images.
I0917 18:15:13.638947 8605 db.cpp:34] Opened lmdb data_split/dataCNN_train_lmdb
E0917 18:15:13.639143 8605 io.cpp:77] Could not open or find file ...
E0917 18:15:13.639143 8605 io.cpp:77] Could not open or find file ...
E0917 18:15:13.639143 8605 io.cpp:77] Could not open or find file ...
Here are some sample lines from the train.txt file (do not mind the 0s; they are just dummy labels):
/media/user/HDD_2TB/Food_101_Dataset/images/beef_carpaccio/970563.jpg 0
/media/user/HDD_2TB/Food_101_Dataset/images/chocolate_mousse/1908117.jpg 0
/media/user/HDD_2TB/Food_101_Dataset/images/cup_cakes/632892.jpg 0
/media/user/HDD_2TB/Food_101_Dataset/images/garlic_bread/1498092.jpg 0
/media/user/HDD_2TB/Food_101_Dataset/images/ceviche/3115634.jpg 0
They are absolute paths, so there should be no problem.
Any idea about what could be happening would be very helpful!
Thank you,
Marc

I have not been able to solve the specific problem within Matlab, but I have managed to work around it (in a somewhat weird way) by using .txt files for communication; a sketch of the Python side is shown after the list:
Call the main Matlab program from Python.
Check from Python whether Matlab needs to call the ./example_shell.sh script.
Python performs the conversion by calling ./example_shell.sh.
Matlab execution continues.
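For illustration, here is a minimal sketch of what the Python side could look like. The flag file names (run_conversion.flag, conversion_done.txt) and the Matlab entry point (main_program) are hypothetical placeholders; only the call to ./example_shell.sh comes from the setup above.

import subprocess
import time
from pathlib import Path

# Hypothetical flag files used for Matlab <-> Python signalling.
REQUEST_FLAG = Path("run_conversion.flag")   # written by Matlab when it needs the conversion
DONE_FLAG = Path("conversion_done.txt")      # written by Python when the conversion finishes

def main():
    # Step 1: launch the main Matlab program (hypothetical script name).
    matlab = subprocess.Popen(["matlab", "-nodisplay", "-r", "main_program; exit"])

    # Steps 2-4: poll until Matlab asks for the conversion, run it, then signal completion.
    while matlab.poll() is None:          # keep watching while Matlab is still running
        if REQUEST_FLAG.exists():
            REQUEST_FLAG.unlink()         # consume the request
            result = subprocess.run(["./example_shell.sh"])
            DONE_FLAG.write_text(str(result.returncode))  # Matlab waits for this file
        time.sleep(1.0)

if __name__ == "__main__":
    main()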

Related

Upload to HDFS stops with warning "Slow ReadProcessor read"

When I try to upload files that are about 20 GB to HDFS, they usually upload until about 12-14 GB and then stop, and I get a bunch of these warnings on the command line:
"INFO hdfs.DataStreamer: Slow ReadProcessor read fields for block BP-222805046-10.66.4.100-1587360338928:blk_1073743783_2960 took 62414ms (threshold=30000ms); ack: seqno: 226662 reply: SUCCESS downstreamAckTimeNanos: 0 flag: 0, targets:"
However, if I try to upload the files 5-6 times, they sometimes work after the 4th or 5th attempt. I believe that if I alter some datanode storage settings I can achieve consistent uploads without issues, but I don't know which parameters to modify in the Hadoop configuration. Thanks!
Edit: This happens when I put the file into HDFS through a Python program that uses a subprocess call to put the file in. However, even if I call it directly from the command line, I still run into the same issue.
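For context, a minimal sketch of what such a subprocess-based upload might look like; the local path and HDFS destination are made-up placeholders, not taken from the question:

import subprocess

def put_to_hdfs(local_path, hdfs_dir):
    # Shell out to the HDFS client; -f overwrites an existing file at the destination.
    subprocess.run(["hdfs", "dfs", "-put", "-f", local_path, hdfs_dir], check=True)

# Placeholder paths for illustration only.
put_to_hdfs("/data/big_file.bin", "/user/me/uploads/")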

What is the file ORA_DUMMY_FILE.f in Oracle?

Oracle version: 12.2.0.1
As you know, these are the Unix processes for the parallel servers in Oracle:
ora_p000_ora12c
ora_p001_ora12c
....
ora_p???_ora12c
They can also be seen with the view gv$px_process.
The spid for each parallel server can be obtained from there.
Then I look for the open files associated with the parallel server here:
ls -l /proc/<spid>/fd
For several parallel servers I am finding around 500-10000 file descriptors identical to this one:
991 -> /u01/app/oracle/admin/ora12c/dpdump/676185682F2D4EA0E0530100007FFF5E/ORA_DUMMY_FILE.f (deleted)
I've deleted them using the following (actually I've created a small script to do it, because there are thousands of them):
gdb -p <spid>
gdb> p close(<fd_id>)
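For reference, here is a sketch of what such a cleanup script could look like; this is an assumption about the approach (Python plus a batched gdb call), not the exact script used. It scans /proc/<spid>/fd for descriptors pointing to the deleted ORA_DUMMY_FILE.f and closes them in a single gdb session:

import os
import subprocess
import sys

def close_dummy_fds(spid):
    # Collect the fd numbers whose symlinks point to the deleted dummy file.
    fd_dir = "/proc/%s/fd" % spid
    targets = []
    for fd in os.listdir(fd_dir):
        try:
            link = os.readlink(os.path.join(fd_dir, fd))
        except OSError:
            continue  # the fd may have disappeared while scanning
        if "ORA_DUMMY_FILE.f (deleted)" in link:
            targets.append(fd)

    if not targets:
        return

    # Attach once and issue one close() call per descriptor,
    # instead of starting a separate gdb session for each fd.
    commands = []
    for fd in targets:
        commands += ["-ex", "call (int) close(%s)" % fd]
    subprocess.run(["gdb", "-batch", "-p", str(spid)] + commands, check=True)

if __name__ == "__main__":
    close_dummy_fds(sys.argv[1])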
But after some hours the file descriptors start being created again (hundreds every day).
If they are not deleted, then eventually the Linux limit is reached and any parallel query throws an error like this:
ORA-12801: error signaled in parallel query server P001
ORA-01116: error in opening database file 132
ORA-01110: data file 132: '/u02/oradata/ora12c/pdbname/tablespacenaname_ts_1.dbf'
ORA-27077: too many files open
Does anyone have any idea of how and why these file descriptors are being created, and how to avoid it?
Edit: Added some more information that could be useful.
I've verified that when a new PDB is created, a DATA_PUMP_DIR directory object is created in it (select * from all_directories) pointing to:
/u01/app/oracle/admin/ora12c/dpdump/<xxxxxxxxxxxxx>
The Linux directory is also created.
Also, one file descriptor is created pointing to ORA_DUMMY_FILE.f in the new dpdump subdirectory, like the ones described initially:
lsof | grep "ORA_DUMMY_FILE.f (deleted)"
/u01/app/oracle/admin/ora12c/dpdump/<xxxxxxxxxxxxx>/ORA_DUMMY_FILE.f (deleted)
This may be OK; the problem I face is the continuous growth of the file descriptors pointing to ORA_DUMMY_FILE.f, which eventually hits the Linux limits.

MapReduceIndexerTool output dir error "Cannot write parent of file"

I want to use Cloudera's MapReduceIndexerTool to understand how morphlines work. I created a basic morphline that just reads lines from the input file, and I tried to run the tool using this command:
hadoop jar /opt/cloudera/parcels/CDH/lib/solr/contrib/mr/search-mr-*-job.jar org.apache.solr.hadoop.MapReduceIndexerTool \
--morphline-file morphline.conf \
--output-dir hdfs:///hostname/dir/ \
--dry-run true
Hadoop is installed on the same machine where I run this command.
The error I'm getting is the following:
net.sourceforge.argparse4j.inf.ArgumentParserException: Cannot write parent of file: hdfs:/hostname/dir
at org.apache.solr.hadoop.PathArgumentType.verifyCanWriteParent(PathArgumentType.java:200)
The /dir directory has 777 permissions, so writing into it should definitely be allowed. I don't know what I should do to let the tool write into that output directory.
I'm new to HDFS and I don't know how to approach this problem. The logs don't offer any information about it.
What I have tried so far (with no result):
created a hierarchy of 2 directories (/dir/dir2) and put 777 permissions on both of them
changed the output-dir schema from hdfs:///... to hdfs://... because all the examples in the --help menu are built that way, but this leads to an invalid schema error
Thank you.
It states 'cannot write parent of file', and the parent in your case is /. Take a look at the source:
private void verifyCanWriteParent(ArgumentParser parser, Path file) throws ArgumentParserException, IOException {
  Path parent = file.getParent();
  if (parent == null || !fs.exists(parent) || !fs.getFileStatus(parent).getPermission().getUserAction().implies(FsAction.WRITE)) {
    throw new ArgumentParserException("Cannot write parent of file: " + file, parser);
  }
}
The value printed in the message is file, which in your case is hdfs:/hostname/dir, so file.getParent() will be /.
Additionally, you can test the permissions with the hadoop fs command; for example, you can try to create a zero-length file in the path:
hadoop fs -touchz /test-file
I solved that problem after days of working on it.
The problem is with the argument --output-dir hdfs:///hostname/dir/.
First of all, there should not be 3 slashes at the beginning, as I kept writing while trying to make this work; there should be only 2 (as in any valid HDFS URI). I actually put 3 slashes because otherwise the tool throws an invalid schema exception. You can easily see in the code that the schema check is done before the verifyCanWriteParent check. The small parsing example below illustrates what the extra slash changes.
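To make the difference concrete, here is a small illustration using Python's standard urlparse; it is not part of the tool, it just shows how a generic URI parser treats the two forms:

from urllib.parse import urlparse

# Two slashes: "hostname" is the authority (the NameNode host) and "/dir" is the path.
print(urlparse("hdfs://hostname/dir"))
# -> scheme='hdfs', netloc='hostname', path='/dir'

# Three slashes: the authority is empty, so "hostname" becomes part of the path
# instead of being treated as the NameNode host.
print(urlparse("hdfs:///hostname/dir"))
# -> scheme='hdfs', netloc='', path='/hostname/dir'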
I tried to get the hostname by simply running the hostname command on the CentOS machine I was running the tool on. This was the main issue: I analyzed the /etc/hosts file and saw that there are 2 hostnames for the same local IP. I took the second one and it worked. I also attached the port to the hostname, so the final format is the following: --output-dir hdfs://correct_hostname:8020/path/to/file/from/hdfs
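If you are unsure which hostname and port the cluster actually advertises, one way to check (assuming the hdfs client is on your PATH) is to read fs.defaultFS and build the output directory from it; the path suffix below is just the placeholder used above:

import subprocess

# Ask the HDFS client for the configured default filesystem, e.g. hdfs://correct_hostname:8020
default_fs = subprocess.run(
    ["hdfs", "getconf", "-confKey", "fs.defaultFS"],
    capture_output=True, text=True, check=True,
).stdout.strip()

output_dir = default_fs + "/path/to/file/from/hdfs"
print(output_dir)  # use this value for --output-dir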
This error is very confusing because everywhere you look for the namenode hostname, you will see the same thing that the hostname command returns. Moreover, the errors are not structured in a way that you can diagnose the problem and take a logical path to solve it.
Additional information regarding this tool and debugging it
If you want to see the actual code that runs behind it, check the Cloudera version that you are running and select the same branch in the official repository. The master branch is not up to date.
If you just want to run this tool to play with the morphline (by using the --dry-run option) without connecting to Solr, you can't. You have to specify a Zookeeper endpoint and a Solr collection or a Solr config directory, which involves additional work to research. This is something that could be improved in this tool.
You don't need to run the tool with -u hdfs; it works with a regular user.

Send data by network and plot with octave

I am working on a robot and my goal is to plot the state of the robot.
For now, my workflow is this:
Launch the program
Redirect the output in a file (robot/bash): rosrun explo explo_node > states.txt
Send the file to my local machine (robot/bash): scp states.txt my_desktop:/home/user
Plot the states with octave (desktop/octave): plot_data('states.txt')
Is there a simple solution to get the data in "real time"? On the Octave side, I think I can, without too much difficulty, read from a file as input and plot the data whenever data is added.
The problem is: how do I send the data to a file?
I am open to solutions other than Octave. The thing is that I need a 2D plot with arrows for the orientation of the robot.
Here's an example of how you could send the data over the network (as Andy suggested) and plot it as it is generated (i.e. in real time). I also think this approach is the most flexible and appropriate.
To demonstrate, I will use a bash script that generates an (x, sin(x)) pair every 10th of a second, for x in the range 0 to 31.4:
#!/bin/bash
# script: sin.sh
for i in `seq 0 0.01 31.4`;
do
printf "$i, `echo "s($i)" | bc -l`\n"
sleep 0.1
done
(Don't forget to make this script executable!)
Prepare the following octave script (requires the sockets package!):
% in visualiseRobotData.m
pkg load sockets
s = socket();
bind(s, 9000);
listen(s, 1);
c = accept(s);
figure; hold on;
while true
  a = str2num (char (recv (c, inf)));
  if isempty (a), break; end   % stop once the sender closes the connection
  plot (a(:,1), a(:,2), '*'); drawnow;
end
hold off;
Now execute things in the following order:
Run the visualiseRobotData script from the octave terminal.
(Note: this will block until a connection is established)
From your bash terminal run: ./sin.sh | nc localhost 9000
And watch the datapoints get plotted as they come in from your sin.sh script.
It's a bit crude, but you can just reload the file in a loop. This one runs for 5 minutes:
for i = 1:300
  load Test/sine.txt
  plot (sine(:,1), sine(:,2))
  sleep (1)
endfor
You can mount the remote directory via sshfs:
sshfs user@remote:/path/to/remote_dir local_dir
so you wouldn't have to copy the remote file over. If sshfs is not installed, install it. To unmount the remote directory later, execute
fusermount -u local_dir
To get the robot's data from Octave, execute (Octave code):
system("ssh user@host 'cd remote_dir; rosrun explo explo_node > states.txt'")
%% then plot the picture from the data in local_dir,
%% which is de facto the directory on the remote server

Code shows error in cluster but works fine otherwise

Hello everyone,
I have a bash file which has the following code:
./lda --num_topics 15 --alpha 0.1 --beta 0.01 --training_data_file testdata/test_data.txt --model_file Model_Files/lda_model_t15.txt --burn_in_iterations 120 --total_iterations 150
This works perfectly fine normally, but when I run it on a cluster it does not load the data that it is supposed to load from the connected .cc files. I have put #!/bin/bash in the header. What can I do to rectify this situation? Please help!
You will need to specify the full path to the lda executable. Since it is not invoked by you manually, the system will not know where to find the executable when it is invoked by the shell. Since this is not a shell command, you don't even necessarily need the #!/bin/bash line.
/<FullPath>/lda --num_topics 15 --alpha 0.1 --beta 0.01 --training_data_file testdata/test_data.txt --model_file Model_Files/lda_model_t15.txt --burn_in_iterations 120 --total_iterations 150
