WKHTMLTOPDF and "Error: Unable to create temporery file" - wkhtmltopdf

I've written a piece of PHP code that generates PDFs using the WKHTMLTOPDF binary. It was working fine until I had to recompile Apache. Now it fails with the error Error: Unable to create temporery file (this is the exact wording).
The situation in which the error is reproducible is a little complicated, but I managed to narrow it down, and I'm now fairly sure the error is caused by the user Apache runs as: when WKHTMLTOPDF runs as a user with no home folder, it seems unable to access a temporary folder within that user's home folder.
Of course I could change Apache's user, but I would rather resolve this problem once and for all. To that end, it would be great if I could somehow set the temp folder for WKHTMLTOPDF, or at least print its current value so I can make it valid. Does anyone know how to do either of these?
BTW, I'm using WKHTMLTOPDF 0.11.0 rc1.

I saw the same error today with Rails 4 + the pdfkit gem (0.8.2) + wkhtmltopdf (0.12.2.1) under CentOS 6.7.
The error came from wkhtmltopdf, and the reason was that it couldn't create a temporary file. wkhtmltopdf depends on some temporary-filename creation API (I'm not sure which one), but the following man pages may offer hints:
$ man tempfile
$ man tempnam
In my case, my TMPDIR environment variable pointed to a wrong path (I had accidentally deleted the directory!), so wkhtmltopdf couldn't create its work file.
When I unset TMPDIR, it worked! Of course, setting TMPDIR to an existing, valid directory should work too.
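A quick way to check and fix this from the shell (the replacement directory below is just an example; any existing directory writable by the process will do):
$ echo $TMPDIR
$ ls -ld "$TMPDIR"
$ unset TMPDIR              # fall back to the default /tmp
$ export TMPDIR=/var/tmp    # or point it at a known-good directory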

Google Cloud Functions and shared libraries

I'm trying to use wkhtmltopdf on GCF for PDF generation.
When my function tries to spawn the child process I get the following error:
Error: ./services/wkhtmltopdf: error while loading shared libraries: libXrender.so.1: cannot open shared object file: No such file or directory
The problem is clearly that the wkhtmltopdf binary depends on external shared libraries which are not installed in the GCF environment.
Is there a way to solve this issue, or should I give up and use another solution (AWS Lambda or GAE)?
Thank you in advance
Indeed, I've found a way to solve this issue by copying all the required libraries into the same folder (/bin for me) that contains the wkhtmltopdf binary. To let the binary use the uploaded libraries, I added the following lines to wkhtmltopdf.js:
var path = require('path');               // implied by the path.resolve call below
var wkhtmltopdf = require('wkhtmltopdf'); // the npm wkhtmltopdf wrapper being configured
wkhtmltopdf.command = 'LD_LIBRARY_PATH=' + path.resolve(__dirname, 'bin') + ' ./bin/wkhtmltopdf';
wkhtmltopdf.shell = '/bin/bash';
module.exports = wkhtmltopdf;
Everything worked fine for a while. Then, all of a sudden, I started receiving many connection errors or timeouts from GCF, but I think that's related to Google rather than to my implementation.
I've ended up setting up a dedicated server.
I have managed to get it working. There are two things that need to be done, as wkhtmltopdf won't work if:
libXrender.so.1 can't be loaded
you are using stdout to collect the resulting PDF; wkhtmltopdf has to write the result into a file
First you need to obtain the correct version of libXrender.
I found out which Docker image Cloud Functions uses as the base for Node.js functions. I ran it locally, installed libxrender, and copied the library into my function's directory.
docker run -it --rm=true -v /tmp/d:/tmp/d gcr.io/google-appengine/nodejs bash
Then, inside the running container:
apt update
apt install libxrender1
cp /usr/lib/x86_64-linux-gnu/libXrender.so.1 /tmp/d
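Back on the host, the library can then be copied from /tmp/d into the function's project, for example (the lib destination matches the layout described next):
mkdir -p lib
cp /tmp/d/libXrender.so.1 lib/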
I put this into my function's project directory, under a lib subdirectory. In my function's source file, I then set up LD_LIBRARY_PATH to include the /user_code/lib directory (/user_code is the directory where your function will ultimately be put by Google):
process.env['LD_LIBRARY_PATH'] = '/user_code/lib'
This is enough for wkhtmltopdf to be able to execute. It will still fail, though, as it won't be able to write to stdout, and the function will eventually time out and be killed (as Matteo experienced). I think this is because Google runs the containers without a tty (just speculation); I can run my code in their container if I run it with the docker run -it flags. To solve this, I invoke wkhtmltopdf so that it writes the output into a file under /tmp (this is an in-memory tmpfs). I then read the file back and send it as my response body. Note that the tmpfs might be reused between function calls, so you need to use a unique file every time.
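A minimal sketch of that invocation from a shell, with a unique output file per call (the URL and library path here are placeholders):
OUT=/tmp/out-$$-$(date +%s%N).pdf
LD_LIBRARY_PATH=/user_code/lib ./bin/wkhtmltopdf http://example.com "$OUT"
# read $OUT back as the response body, then remove it so the tmpfs doesn't accumulate files
rm -f "$OUT"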
This seems to do the trick, and I am able to run wkhtmltopdf as a Google Cloud Function.

Elasticsearch failed to start - CreateJavaVM Failed

Sorry if this is a trivial question, but I've been banging my head against this and getting nowhere, so I thought I'd post it here.
I'm trying to install Elasticsearch on a Windows 2008 server on Azure. It appears to have installed correctly, but I cannot get it to start.
I have looked around for similar errors and double-checked my JAVA_HOME variable - it appears to be correct, as does the config file.
I also expanded out the heap size via editing the java options files, still no luck.
Any help would be greatly appreciated
[screenshot: Output Log file]
[screenshot: JAVA_HOME variable]
Which version are you installing? The latest, which would be 5.5.0?
Which installation method did you use? The ZIP or the MSI file?
The last line in your Output Log file screenshot actually shows the error message: The data area passed to a system call is too small.
I'm taking a wild guess: you set JAVA_HOME in the user variables, but it must be set in the system variables.
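If that's the case, one way to set it machine-wide is from an elevated command prompt (the JDK path below is just an example; use your actual install location), then restart the service so the change is picked up:
setx JAVA_HOME "C:\Program Files\Java\jdk1.8.0_131" /M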

nutch 1.10 input path does not exist /linkdb/current

When I run nutch 1.10 with the following command, assuming that TestCrawl2 did not previously exist and needs to be created...
sudo -E bin/crawl -i -D solr.server.url=http://localhost:8983/solr/TestCrawlCore2 urls/ TestCrawl2/ 20
I receive an error on indexing that claims:
Indexer: org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/opt/apache-nutch-1.10/TestCrawl2/linkdb/current
The linkdb directory exists, but does not contain the 'current' directory. The directory is owned by root, so there should be no permissions issues. Because the process exited on an error, the linkdb directory contains .locked and ..locked.crc files. If I run the command again, these lock files cause it to exit at the same place. Delete the TestCrawl2 directory, rinse, repeat.
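For reference, clearing just the stale lock files (instead of deleting the whole TestCrawl2 directory) looks like this, using the paths from the error above:
rm /opt/apache-nutch-1.10/TestCrawl2/linkdb/.locked
rm /opt/apache-nutch-1.10/TestCrawl2/linkdb/..locked.crc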
Note that the nutch and solr installations themselves have previously run without problems in a TestCrawl instance. It's only now that I'm trying a new one that I'm having problems. Any suggestions on troubleshooting this issue?
Ok, it seems as though I have run into a version of this problem:
https://issues.apache.org/jira/browse/NUTCH-2041
This is a result of the crawl script not being aware of changes to ignore_external_links in my nutch-site.xml file.
I am trying to crawl several sites and was hoping to keep my life simple by ignoring external links and leaving regex-urlfilter.txt alone (just using +.)
Now it looks like I'll have to change ignore_external_links back to false and add a regex filter for each of my URLs. Hopefully there will be a nutch 1.11 release soon; it looks like this is fixed there.
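For example, entries of this shape in regex-urlfilter.txt would limit the crawl to specific sites (the domains here are placeholders for my actual URLs):
+^https?://(www\.)?example\.com/
+^https?://(www\.)?example\.org/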

how to customize login page for shibboleth idp

I would like to customize the login page and I'm trying to follow the shibboleth wiki, but I'm not sure where to find " src/main/webapp/login.jsp within your IdP distribution package" in order to modify it. My shibboleth resides in /opt/shibboleth-idp, but I don't have a src folder in there. Any help would be appreciated.
For IdP version 3, you can customize by changing the files in the "views" directory. These are Apache Velocity templates, and you can make changes that become active without having to rebuild the war file.
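For example, with an install at /opt/shibboleth-idp (template names can vary slightly between 3.x versions):
ls /opt/shibboleth-idp/views
vi /opt/shibboleth-idp/views/login.vm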
(sorry this is two months late, but...)
the files for login are not stored inside your shibboleth-idp directory. (well, they're sorta in there...rolled into the java war file.)
somewhere, there should be a directory that was used to build your shibboleth-idp instance. many times i've seen it in the same folder as the shibboleth-idp folder, but it doesn't have to be. so since yours is /opt/shibboleth-idp, it might be at /opt/shibboleth-identityprovider-version.number. if not, use the find command as already suggested, but maybe try something like
find / -name 'shibboleth-identityprovider*' -ls 2>/dev/null
unless someone built it off-box, that folder should exist somewhere. inside there is the src directory where login.jsp resides.
the install script the shib doc tells you to run after making your changes is at the top level of that shibboleth-identityprovider-version.number folder too (install.sh for unix). when you run the install script, you tell it where to put the idp files (in your case, /opt/shibboleth-idp).
also, before running the install script, it's a good idea to back up your conf directory. you might accidentally tell the install script to overwrite it. or it might do it even if you told it not to (bug in some versions).
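for example, before re-running install.sh (using the path from the question):
cp -r /opt/shibboleth-idp/conf /opt/shibboleth-idp/conf.backup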
I recommend starting with the Linux find command:
find /opt/shibboleth-idp/ -name login.jsp

Robot framework doesn't see environment variable selenium_jar

I'm using a Mac.
When I run the pybot test.txt command, I get an error message: No executable path given, please add one to Environment Variable 'SELENIUM_SERVER_JAR'
But I downloaded this jar from the Selenium site and tried the following things, and nothing worked:
added this folder path to /etc/paths;
added the variable to /etc/launchd.conf;
setenv SELENIUM_SERVER_JAR /Users/User/Downloads/SeleniumServer/selenium-server-standalone.jar
export PATH=$PATH:/Users/User/Downloads/SeleniumServer/;
After all of this, I ran pybot test.txt again and received the same error message.
Please help; I just don't know how to configure Robot Framework to work with Safari.
Based on the response to the comments, it appears you set SELENIUM_SERVER_JAR to the wrong value. It needs to be set to the full path of the jar file, not just the folder that contains the jar file.
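For example, in your shell profile (this uses the download path from the question; point it at wherever the jar actually lives):
export SELENIUM_SERVER_JAR=/Users/User/Downloads/SeleniumServer/selenium-server-standalone.jar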
