I'm new to script writing, and I wanted to know whether a script can use another piece of software as part of its work. For example, I need a script that processes a directory of images with OCR software and stores the output in another file.
Is it possible to do so? If so, how can I call the software from the script? For instance, what if I want to use Adobe's scan-to-PDF function or ABBYY's text grabber?
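In general a script can drive any program that offers a command-line interface. The sketch below is only an illustration of that idea in Python: it assumes the open-source `tesseract` command-line tool as a stand-in (it is not mentioned in the question), and the function names are hypothetical. Whether the Adobe or ABBYY products expose a comparable command line is something to check in their own documentation.

```python
import subprocess
from pathlib import Path

# Assumption: the OCR tool is the `tesseract` CLI, called as
# `tesseract <image> <output_base>`; it writes <output_base>.txt itself.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def build_ocr_command(image, out_dir):
    """Return the argument list for one OCR run on a single image."""
    return ["tesseract", str(image), str(out_dir / image.stem)]


def ocr_directory(src, out_dir, runner=subprocess.run):
    """Run the OCR tool once per image in src and return the images processed."""
    out_dir.mkdir(parents=True, exist_ok=True)
    processed = []
    for image in sorted(src.iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip anything that is not an image
        # check=True raises CalledProcessError if the tool exits non-zero.
        runner(build_ocr_command(image, out_dir), check=True)
        processed.append(image)
    return processed
```

Calling `ocr_directory(Path("scans"), Path("ocr_output"))` would then process every image in a `scans` folder; both folder names are placeholders.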
Related
If one has a PDF file created by a third party using software product X, would it be easier to perform file manipulation, editing and text extraction with that same product X? That is, is there an advantage to using the same software that the third party used to create the PDF file?
I am currently developing a web crawler which retrieves data from a specific website and parses it into an .xlsx file. I'd like to know if it is possible to create an .exe file that, on Windows, you would only have to execute to start the web spider, retrieve the data, etc.
Thanks
(I'm developing on Linux, but the people using it will be on Windows; that's why I'm looking for an easy way for them to run this program.)
I would like to write a small program, or script, to extract a set of pictures from a pdf.
I have several PDFs, and each has a table of pictures. I would like to have one picture per file, so I need a way to extract them. Due to the nature of the PDFs (a table/grid), it seems that it would be much easier to write a program than to use some manual method. However, I have no idea what tools are available.
What libraries are available?
Preference: Python, then C# or Java, then maybe some other language (my C and C++ are rusty; I have not used them for years).
I am on Debian GNU/Linux, so I have a wide choice of tools.
I went with PDFBox (an Apache project, so Free Software). It is a Java library and a command-line tool (the app module). I then scripted it with a bit of Python to process the extracted text (yes, it did that as well) and rename the image files.
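A rough sketch of that kind of glue script follows. It is not the script actually used: the jar file name is a placeholder, the command follows the PDFBox 2.x `ExtractImages` form (3.x uses a different subcommand syntax), and the renaming step assumes the extracted files are named `<prefix>-<number>.<ext>`.

```python
import re
import subprocess
from pathlib import Path

# Hypothetical jar location; the real file name depends on the PDFBox version.
PDFBOX_JAR = "pdfbox-app.jar"


def extract_images_command(pdf, prefix):
    """Build a PDFBox 2.x style ExtractImages command line."""
    return ["java", "-jar", PDFBOX_JAR, "ExtractImages", "-prefix", prefix, str(pdf)]


def rename_images(folder, prefix, names):
    """Rename files called <prefix>-<n>.<ext>, in numeric order, to the given names."""
    pattern = re.compile(re.escape(prefix) + r"-(\d+)$")
    numbered = []
    for path in folder.iterdir():
        match = pattern.match(path.stem)
        if match:
            # Sort by the number so that image 10 comes after image 2.
            numbered.append((int(match.group(1)), path))
    renamed = []
    for (_, image), name in zip(sorted(numbered), names):
        target = image.with_name(name + image.suffix)
        image.rename(target)
        renamed.append(target)
    return renamed
```

The extraction itself would be run with something like `subprocess.run(extract_images_command(pdf, "img"), check=True)` before calling `rename_images`.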
I have recently purchased a Mac and have found out that it does not support Batch (.bat) files at all. All I want is a website that will convert a Batch file program to a Bash file program, so I don't have to learn a whole new programming language, since I am so used to Batch programming. Is there an online converter, or possibly an app or program that can be found on a website or on the Mac App Store? If so, could you please tell me the name of the program, where I can find it, and a link to it? Thanks!
There is no possible way to do this. Why? Because not only are both the source and destination languages Turing complete, both are hideous piles of accumulated hacks from decades upon decades of maintenance programmers. And not only that, let's say you map all the syntax from Windows Batch to Bash (careful: version 3.2 only, Mac doesn't have newer!). Then what will you do when the original script invokes an external program? Will you know to map Windows Movie Maker to iMovie? Microsoft Office to iWork? Internet Explorer to Safari?
What about the fact that the two different systems have different rules about what constitutes a valid filename? If the source script mentions C:\Windows, what does that mean on a Mac?
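The filename point can be made concrete with Python's `pathlib`, which can parse the same string under Windows rules and under POSIX rules without touching the file system. This is only an illustration of the mismatch, not a conversion tool; the example path is arbitrary.

```python
from pathlib import PurePosixPath, PureWindowsPath

raw = r"C:\Windows\System32"

as_windows = PureWindowsPath(raw)
as_posix = PurePosixPath(raw)

# Windows rules: a drive letter plus backslash-separated components.
print(as_windows.drive, as_windows.parts)

# POSIX rules: there is no drive, and a backslash is an ordinary character
# in a file name, so the whole string is a single relative name.
print(as_posix.is_absolute(), as_posix.parts)
```

Under POSIX rules the string is simply one oddly named relative file, so there is nothing sensible for a converter to map it to.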
There is a never ending amount of work required to do what you're asking. Perhaps you can narrow the scope (a lot).
This website here claims to have a way to do it manually. But I do not have any Mac OS X computers to test whether it's true. I wish I could be more help.
I am repeatedly calling a matlab script MyMatlabScript from another program (written in Erlang). I am doing this using a batch file containing the following:
matlab -nodesktop -nosplash -wait -r "addpath('C:/...'); MyMatlabScript; %quit;"
This means that Matlab has to launch every time I run the batch file. It works, but it is slow*.
To improve performance I would like to be able to launch Matlab once and then somehow, using Erlang or a batch script, repeatedly initiate my Matlab script using that one instance of Matlab.
Can this be done?
Note: I am using Matlab 7.8.0 (R2009a) on Windows 7.
*Extra slow due to issue outlined here!
It is not simple, but you can try using the COM automation server interface in MATLAB. You need an Erlang library for interfacing with COM automation servers. With this interface you can create an automation server and then keep sending commands to it. The documentation is available at http://www.mathworks.com/help/matlab/call-matlab-com-automation-server.html. The documentation includes examples that use Visual Basic code.
I do not know whether passing messages into Matlab is a viable option, but I would like to propose an alternative. Matlab has a "timer" object, which lets you specify a callback function. At regular intervals, the callback could check a file that your Erlang program modifies. A changed file triggers the desired Matlab routine. It is not "haute cuisine" in terms of programming style, but it should do the job.
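The polling idea itself is independent of Matlab. As a rough sketch of the logic only, written in Python rather than as a Matlab timer callback, the check could look like this; the function names and the choice of modification time as the change signal are assumptions.

```python
import os
import time


def current_mtime(path):
    """Return the file's modification time in nanoseconds, or None if it is missing."""
    try:
        return os.stat(path).st_mtime_ns
    except FileNotFoundError:
        return None


def poll_once(path, last_mtime):
    """Return (changed, mtime); changed is True when the file exists and its mtime moved."""
    now = current_mtime(path)
    changed = now is not None and now != last_mtime
    return changed, now


def watch(path, on_change, interval=1.0, max_polls=None):
    """Check the trigger file every `interval` seconds and call on_change() on a change."""
    last = current_mtime(path)  # baseline: an already existing file does not fire
    polls = 0
    while max_polls is None or polls < max_polls:
        time.sleep(interval)
        changed, last = poll_once(path, last)
        if changed:
            on_change()
        polls += 1
```

The other program only has to rewrite the trigger file; the long-running side notices the new modification time on its next tick and runs the routine.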
I have experience in just this. There are three predominant options:
1. Erlang command-line calls to Matlab using os:cmd().
2. Writing a protocol, which requires the two applications to run separately and communicate over TCP/IP. The benefit is that Erlang is now a server (or a client, depending on how you code it). The challenge is the protocol code on the Matlab side; Erlang is particularly well suited to this.
3. Making a system pipe. If you're sticking to Windows (NamedSystemPipe), you really shouldn't have a problem finding docs on how to do it.
I prefer option 3 for local-only communication and option 2 for anything network-based. Option 1 gives you the least flexibility. There are more options, but since you're asking, this is what I recommend.
Best of all, the 'slow' problem goes away once you stop using option 1.
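To show the shape of the TCP/IP approach, here is a minimal sketch of a long-lived worker that answers newline-terminated commands over a socket, so the startup cost is paid only once. It is written in Python purely for illustration; it is neither Erlang nor Matlab code, and `run_job` is a hypothetical stand-in for the expensive routine.

```python
import socketserver


def run_job(argument):
    """Hypothetical stand-in for the expensive routine the worker keeps loaded."""
    return argument.upper()


class CommandHandler(socketserver.StreamRequestHandler):
    """Answer one newline-terminated command per line until 'quit' or disconnect."""

    def handle(self):
        for raw in self.rfile:
            command = raw.decode("utf-8").strip()
            if command == "quit":
                break
            reply = run_job(command) + "\n"
            self.wfile.write(reply.encode("utf-8"))


def make_server(host="127.0.0.1", port=0):
    """Create the server; port 0 asks the OS for a free port (see server_address)."""
    return socketserver.TCPServer((host, port), CommandHandler)
```

The worker would call `make_server().serve_forever()` once at startup, and the calling program would connect, send one line per request, and read one line back for each.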