Universal Wiki Converter throws OutOfMemoryError - bash

I am trying to use UWC 4.0 to convert a MoinMoin site, but it runs out of heap space no matter how much I increase the memory. Currently the launch script (run_cmdline.sh) is:
# BEGIN
#!/bin/bash
MYCWD=`pwd`
CLASSPATHORIG=$CLASSPATH
CLASSPATH="uwc.jar"
for file in lib/*.jar ; do
CLASSPATH=$MYCWD/$file:$CLASSPATH
done
CLASSPATH=$CLASSPATH:$CLASSPATHORIG
export CLASSPATH
# run out of the sample_files dir
#cd sample_files
java -Xdebug -Xms2G -Xmx4G $APPLE_ARGS -classpath $CLASSPATH com.atlassian.uwc.ui.UWCCommandLineInterface $1 $2 $3 $4
## END
I run the following on the command line:
sudo ./run_cmdline.sh conf/confluenceSettings.properties conf/converter.moinmoin.properties /opt/atlassian/moin/
P.S. If I use just one small folder from the MoinMoin pages directory and try to export it, I get:
java.lang.NullPointerException
at java.util.Hashtable.put(Hashtable.java:542)
at com.atlassian.uwc.ui.ConverterEngine.createPageTable(ConverterEngine.java:2112)
at com.atlassian.uwc.ui.ConverterEngine.sendPage(ConverterEngine.java:2014)
at com.atlassian.uwc.ui.ConverterEngine.sendPage(ConverterEngine.java:1719)
at com.atlassian.uwc.ui.ConverterEngine.writePages(ConverterEngine.java:1356)
at com.atlassian.uwc.ui.ConverterEngine.convert(ConverterEngine.java:421)
at com.atlassian.uwc.ui.ConverterEngine.convert(ConverterEngine.java:215)
at com.atlassian.uwc.ui.UWCCommandLineInterface.convert(UWCCommandLineInterface.java:175)
at com.atlassian.uwc.ui.UWCCommandLineInterface.main(UWCCommandLineInterface.java:61)

UWC supports only Confluence 3.5 and lower, not any Confluence version above 3.5.


Snakemake conda env parameter is not taken from config.yaml file

I use a conda env that I create manually, not automatically using Snakemake. I do this to keep tighter version control.
Anyway, in my config.yaml I have the following line:
conda_env: '/rst1/2017-0205_illuminaseq/scratch/swo-406/snakemake'
Then, at the start of my Snakefile I read that variable (reading variables from config in your shell part does not seem to work, am I right?):
conda_env = config['conda_env']
Then in a shell part I use said parameter like this:
rule rsem_quantify:
    input:
        os.path.join(fastq_dir, '{sample}_R1_001.fastq.gz'),
        os.path.join(fastq_dir, '{sample}_R2_001.fastq.gz')
    output:
        os.path.join(analyzed_dir, '{sample}.genes.results'),
        os.path.join(analyzed_dir, '{sample}.STAR.genome.bam')
    threads: 8
    shell:
        '''
        #!/bin/bash
        source activate {conda_env}
        rsem-calculate-expression \
            --paired-end \
            {input} \
            {rsem_ref_base} \
            {analyzed_dir}/{wildcards.sample} \
            --strandedness reverse \
            --num-threads {threads} \
            --star \
            --star-gzipped-read-file \
            --star-output-genome-bam
        '''
Notice the {conda_env}. Now this gives me the following error:
Could not find conda environment: None
You can list all discoverable environments with `conda info --envs`.
Now, if I replace {conda_env} with its value directly, /rst1/2017-0205_illuminaseq/scratch/swo-406/snakemake, it does work! I don't have any trouble reading other parameters using this method (like rsem_ref_base and analyzed_dir in the example rule above).
What could be wrong here?
Highest regards,
Freek.
The pattern I use is to load variables into params, so something along the lines of
rule rsem_quantify:
    input:
        os.path.join(fastq_dir, '{sample}_R1_001.fastq.gz'),
        os.path.join(fastq_dir, '{sample}_R2_001.fastq.gz')
    output:
        os.path.join(analyzed_dir, '{sample}.genes.results'),
        os.path.join(analyzed_dir, '{sample}.STAR.genome.bam')
    params:
        conda_env=config['conda_env']
    threads: 8
    shell:
        '''
        #!/bin/bash
        source activate {params.conda_env}
        rsem-calculate-expression \
        ...
        '''
That said, I would not activate a conda environment this way at all, because Snakemake has conda environment management built in. See the section on Integrated Package Management in the docs for details. This makes reproducibility much more manageable.

Postprocess drmemory error stacks with new symbols after process exits

After running a set of tests with drmemory overnight I am trying to resolve the error stacks by providing pdb symbols. The pdb's come from a large samba-mapped repository and using _NT_SYMBOL_PATH at runtime slowed things down too much.
Does anyone know of a tool that post-processes results.txt and pulls new symbols (via _NT_SYMBOL_PATH or otherwise) as required to produce more detailed stacks? If not, any hints for adapting asan_symbolize.py to do this?
https://llvm.org/svn/llvm-project/compiler-rt/trunk/lib/asan/scripts/asan_symbolize.py
What I came up with so far using dbghelp.dll is below. Works but could be better.
https://github.com/patraulea/postpdb
OK, this query does not pertain to the use of windbg and doesn't have anything to do with _NT_SYMBOL_PATH.
Dr. Memory is a memory diagnostic tool akin to valgrind. It is based on the DynamoRIO instrumentation framework and works on raw, unmodified binaries.
On Windows you can invoke it as drmemory.exe calc.exe from a command prompt (cmd.exe).
As soon as the binary finishes execution, a log file named results.txt is written to a default location.
If you have set up _NT_SYMBOL_PATH, drmemory honors it and resolves symbol information from pre-pulled symbol files (viz. *.pdb). It does not seem to download files from the MS symbol server: it seems to ignore the SRV* cache and use only the downstream symbol folder.
So if the pdb file is missing or hasn't been downloaded yet, results.txt will contain stack traces like
# 6 USER32.dll!gapfnScSendMessage +0x1ce (0x75fdc4e7 <USER32.dll+0x1c4e7>)
# 7 USER32.dll!gapfnScSendMessage +0x2ce (0x75fdc5e7 <USER32.dll+0x1c5e7>)
whereas if the symbol file were available it would show
# 6 USER32.dll!InternalCallWinProc
# 7 USER32.dll!UserCallWinProcCheckWow
So basically, as I commented, you need to fetch the symbol files for the exe in question.
You can use symchk on a running process too and create a manifest file.
You can then use symchk on a machine that is connected to the internet to download the symbols, copy them to a local folder on the non-internet machine, and point _NT_SYMBOL_PATH to this folder.
>tlist | grep calc.exe
1772 calc.exe Calculator
>symchk /om calcsyms.txt /ip 1772
SYMCHK: GdiPlus.dll FAILED - MicrosoftWindowsGdiPlus-1.1.7601.17514-gdiplus.pdb mismatched or not found
SYMCHK: FAILED files = 1
SYMCHK: PASSED + IGNORED files = 27
>head -n 4 calcsyms.txt
calc.pdb,971D2945E998438C847643A9DB39C88E2,1
calc.exe,4ce7979dc0000,1
ntdll.pdb,120028FA453F4CD5A6A404EC37396A582,1
ntdll.dll,4ce7b96e13c000,1
>tail -n 4 calcsyms.txt
CLBCatQ.pdb,00A720C79BAC402295B6EBDC147257182,1
clbcatq.dll,4a5bd9b183000,1
oleacc.pdb,67620D076A2E43C5A18ECD5AF77AADBE2,1
oleacc.dll,4a5bdac83c000,1
So, assuming you have fetched the symbols, it would be easiest to rerun the tests with locally cached copies of the symbol files.
If you have fetched the symbols but cannot rerun the tests and have to work solely with the output in results.txt, you have some text processing work to do (sed, grep, awk, or a custom parser).
The drmemory suite comes with symquery.exe in the bin folder, which can be used to resolve the symbols from results.txt.
In the example above, notice the offset relative to the module base, such as
0x1c4e7 in the line # 6 USER32.dll!gapfnScSendMessage +0x1ce (0x75fdc4e7 <USER32.dll+0x1c4e7>)
So for each line in results.txt you have to parse out the offset and invoke symquery on the module like below:
:\>symquery.exe -f -e c:\Windows\System32\user32.dll -a +0x1c4e7
InternalCallWinProc+0x23
??:0
:\>symquery.exe -f -e c:\Windows\System32\user32.dll -a +0x1c5e7
UserCallWinProcCheckWow+0xb3
A simple text-processing example on results.txt, with trimmed output:
:\>grep "^#" results.txt | sed s/".*<"//g
# 0 system call NtUserBuildPropList parameter #2
USER32.dll+0x649d9>)
snip
COMCTL32.dll+0x2f443>)
Notice the comctl32.dll: there is a default comctl32.dll in system32 and several others in winsxs, so you have to consult the other log files, such as global.log, to see the path the DLL was actually loaded from.
symquery.exe -f -e c:\Windows\winsxs\x86_microsoft.windows.common-controls_6595b64144ccf1df_6.0.7601.17514_none_41e6975e2bd6f2b2\comctl32.dll -a +0x2f443
CallOriginalWndProc+0x1a
??:0
symquery.exe -f -e c:\Windows\system32\comctl32.dll -a +0x2f443
DrawInsert+0x120 <----- wrong symbol due to wrong module (late binding / forwarded xxx yyy reasons)
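The parse-and-invoke step described above can be sketched in shell roughly as follows. This is only a sketch under stated assumptions: the function names extract_frames and symquery_commands are made up for illustration, every module is assumed to live in the single directory you pass in (which, as the comctl32.dll case shows, is not always true), and the symquery.exe command lines are only printed, not executed, so you can review the module paths first.

```shell
# Hypothetical helpers, not part of Dr. Memory.

# Print "MODULE OFFSET" for every stack line that contains <MODULE+0xOFFSET>.
# Reads the files given as arguments, or stdin if there are none.
extract_frames() {
    sed -n 's/.*<\([^+<>]*\)+\(0x[0-9a-fA-F]*\)>.*/\1 \2/p' "$@"
}

# Turn those pairs into symquery.exe command lines (printed, not executed).
# $1 is the directory ASSUMED to hold every module; remaining args are log files.
symquery_commands() {
    module_dir=$1
    shift
    extract_frames "$@" | while read -r module offset; do
        printf 'symquery.exe -f -e %s\\%s -a +%s\n' "$module_dir" "$module" "$offset"
    done
}
```

For example, symquery_commands 'c:\Windows\System32' results.txt would print one command per resolvable frame. Lines without a <module+offset> part, such as the system call header lines, are skipped.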

Elasticsearch standalone JDBC river feeder missing main class

I'm trying to set up the feeder following these instructions: https://github.com/jprante/elasticsearch-jdbc#installation
I downloaded and unzipped the feeder
I don't quite understand this step:
run script with a command that starts org.xbib.tools.JDBCImporter with the lib directory on the classpath
What am I supposed to do?
If I try to run a sample script from bin I get:
Bad substitution
Error: Could not find or load main class org.xbib.elasticsearch.plugin.jdbc.feeder.Runner
Where do I get the Java classes org.xbib.elasticsearch.plugin.jdbc.feeder.Runner and org.xbib.elasticsearch.plugin.jdbc.feeder.JDBCFeeder?
I figured out the solution.
It was to set the installation folder in the script (not the elasticsearch folder but the jdbc folder!):
#!/bin/bash
#JDBC Directory -> important, change accordingly!
export JDBC_IMPORTER_HOME=~/Downloads/elasticsearch-jdbc-1.6.0.0
bin=$JDBC_IMPORTER_HOME/bin
lib=$JDBC_IMPORTER_HOME/lib
echo '{
...
...
}
}' | java \
-cp "${lib}/*" \
-Dlog4j.configurationFile=${bin}/log4j2.xml \
org.xbib.tools.Runner \
org.xbib.tools.JDBCImporter

Why does the Jetty boot script remove [SK][0-9] from the base name?

I read the Jetty 9 boot script and found this:
#!/usr/bin/env bash
#
# Startup script for jetty under *nix systems (it works under NT/cygwin too).
##################################################
# Set the name which is used by other variables.
# Defaults to the file name without extension.
##################################################
NAME=$(echo $(basename $0) | sed -e 's/^[SK][0-9]*//' -e 's/\.sh$//')
# To get the service to restart correctly on reboot, uncomment below (3 lines):
# ========================
# chkconfig: 3 99 99
# description: Jetty 9 webserver
# processname: jetty
# ========================
I wonder why we should remove S / K and numbers from NAME.
/search/S2jetty.sh => jetty
Can anybody explain it?
Many thanks!
That line is trying to get the service name from the init script that is currently running.
The symlinks in the /etc/rc*.d runlevel directories (which point to the scripts in /etc/init.d) all start with an S (for start) or a K (for kill) followed by a number (to control sort order).
That letter+number combination is an artifact of the service system and not part of the name of the service in question, so the prefix is removed.
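To see what that sed expression does, here is a small sketch that applies the same two substitutions to a few file names. The function name service_name and the sample paths are made up for illustration.

```shell
# Hypothetical helper wrapping the same sed call the Jetty script uses.
service_name() {
    basename "$1" | sed -e 's/^[SK][0-9]*//' -e 's/\.sh$//'
}

service_name /etc/rc3.d/S99jetty       # prints: jetty
service_name /etc/rc0.d/K01jetty       # prints: jetty
service_name /opt/jetty/bin/jetty.sh   # prints: jetty
```

One thing to be aware of: because [0-9]* also matches zero digits, a leading S or K is stripped even when no number follows, so a script called Solr.sh would come out as olr.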

How to change the memory limits of Tomcat6 (MacPorts install)? (-Xmx ignored???)

I've tried changing the memory limits using the -Xmx flag in catalina.sh, as I've done in the past for Linux installs, but when I access psi-probe (previously Lambda Probe) it claims I have a limit of 1.78Gb. I've tried setting the max limit to 4096m and 6144m, with no effect. The machine I'm running it on has adequate memory to support these configurations, but the limit is still reported as 1.78Gb.
I have a particularly heavy application that keels over with a heap space error at approx 1.6Gb.
Any suggestions as to why this config is being ignored or where it might be overwritten?
EDIT:
Contents of setenv.sh are:
#!/bin/sh
#
# setenv.sh
#
# You may edit this script to set defaults for such variables as JAVA_HOME.
#
# For Apple Java, the $JAVA_HOME is not well respected by the JNI launching code
# in jsvc. On Apple Java systems, you are better off setting JAVA_JVM_VERSION
# to the proper java name, such as 1.4, 1.5, or CurrentJDK, and let JAVA_HOME
# be calculated from that.
#
# First source the conf/setenv.local file to allow user to configure environment
# in an even more minimal fashion.
if [ -r "$CATALINA_HOME/conf/setenv.local" ]; then
    . "$CATALINA_HOME/conf/setenv.local"
fi
# Attempt to set JAVA_HOME if it's not already set
if [ -z "$JAVA_HOME" ]; then
    # Set JAVA_JVM_VERSION and JAVA_HOME for Darwin
    if [ `uname -s` = "Darwin" ]; then
        # Look for a java version specified by JAVA_JVM_VERSION, falling back to current version
        # Set JAVA_HOME to reflect the version
        for jversion in $JAVA_JVM_VERSION CurrentJDK ; do
            jhome="/System/Library/Frameworks/JavaVM.framework/Versions/${jversion}/Home"
            if [ -z "$JAVA_HOME" -a -d "${jhome}" ]; then
                # Get the actual version that any symlink points to, since
                # jni doesn't like JAVA_JVM_VERSION set to CurrentJDK
                saved=`pwd`
                cd "/System/Library/Frameworks/JavaVM.framework/Versions/${jversion}"
                actualvers=$(basename $(pwd -P))
                cd $saved
                export JAVA_JVM_VERSION=${actualvers}
                export JAVA_HOME=${jhome}
            fi
        done
    fi
fi
setenv.local:
#!/bin/sh
#
# setenv.local
#
# This script, if present, is executed by tomcatctl through setenv.sh
# in order to set up any environment prior to executation of tomcat.
#
# For Apple Java, JAVA_JVM_VERSION may be used to specify a particular
# java version to run. It should be something like 1.4, 1.5, or CurrentJDK.
#export JAVA_JVM_VERSION=1.5
catalina.sh (partial, the file is quite long):
#JAVA_OPTS="-Xmx4096m -Xms4096m -XX:PermSize=6144m -XX:MaxPermSize=6144m $JAVA_OPTS"
CATALINA_OPTS="-Xmx4096m -Xms4096m -XX:PermSize=6144m -XX:MaxPermSize=6144m $CATALINA_OPTS"
