RethinkDB import error - rethinkdb

I'm trying to import a CSV or JSON file into RethinkDB, but I always get the same error:
rethinkdb import -f ~/Downloads/convertcsv.json --table test.stats --format json
[ ] 0%
0 rows imported in 1 table
'indexes'
In file: /home/xxxxx/Downloads/convertcsv.json
Errors occurred during import
I don't see anything in the logs, and the same files import fine on my laptop.
The import creates the table, but that's about it.
My system:
- Ubuntu 10.10
- Python 2.7.8
- rethinkdb 1.16.0+1~0utopic (GCC 4.9.1)
I already tried reinstalling RethinkDB and running sudo pip2 install --upgrade rethinkdb. I'm not sure what else I can do.

This appears to have been an oversight when export/import of secondary indexes was added: the import script looks for the indexes field in the table info, which doesn't exist when importing a single file. You can work around it by passing the --no-secondary-indexes flag. A fix was released in version 1.16.0-2 of the RethinkDB Python driver; see GitHub issue #3278 for details.
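The bare 'indexes' line in the import output is consistent with that explanation, since it is exactly what Python produces when a KeyError for a missing "indexes" key is converted to text. The following is a minimal sketch of that failure mode and of a defensive lookup. The table_info dict and both function names are hypothetical and only illustrate the idea; this is not the import script's actual code.

```python
# Hypothetical stand-in for the per-table metadata the import script reads.
# A single-file import has no accompanying info with an "indexes" entry.
table_info = {"db": "test", "table": "stats"}


def error_text(info):
    """Reproduce the failure mode: a direct lookup of a missing key."""
    try:
        return info["indexes"]
    except KeyError as exc:
        # str() of a KeyError is the repr of the missing key.
        return str(exc)


def secondary_indexes(info):
    """Defensive variant: treat a missing field as 'no secondary indexes'."""
    return info.get("indexes", [])


print(error_text(table_info))         # prints 'indexes' (with the quotes)
print(secondary_indexes(table_info))  # prints []
```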

Related

Can't use put() to add data to HBase with happybase

My Python version is 3.7. After running pip3 install happybase, I started the Thrift server with hbase thrift start and wrote a short .py file as follows:
import happybase
connection = happybase.Connection('master')
table = connection.table('jmlr') #'jmlr' is a table in hbase
for i in table.scan():
    print(i)
table.put('001', {'title':'dasds'}) #error here
connection.close()
When it reaches table.put(), it reports this error:
thriftpy2.transport.base.TTransportException: TTransportException(type=4, message='TSocket read 0 bytes')
At the same time, the Thrift server reported an error:
ERROR [thrift-worker-1] thrift.TBoundedThreadPoolServer: Error occurred during processing of message. java.lang.IllegalArgumentException: Invalid famAndQf provided.
But when I ran the Python file again just now, the Thrift server gave a different error:
thrift.TBoundedThreadPoolServer: Thrift error occurred during processing of message.
org.apache.thrift.protocol.TProtocolException: Bad version in readMessageBegin
I have tried adding parameters like protocol='compact' and transport='framed', but this didn't work; with them even table.scan() failed.
Everything works in the hbase shell, so I can't figure out what went wrong. I'm at my wits' end.
I ran into the same issue and found this solution. The column name passed to put() must include the ':' delimiter between column family and column qualifier, even when the qualifier is empty:
table.put('001', {'title:': 'dasds'})
Also, you got a different error message on the second run of the script because the Thrift server had already failed.
I hope this helps.
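The Thrift error mentions famAndQf (column family and qualifier), which points at the column names rather than the row key. As a rough sketch, a small helper can append the missing ':' before the data is handed to put(). The name with_qualifiers is hypothetical and not part of happybase; the helper only rewrites dictionary keys and does not talk to HBase.

```python
def with_qualifiers(data):
    """Return a copy of data in which every column name contains ':'.

    Names that already look like 'family:qualifier' are left alone; a bare
    'family' becomes 'family:' (empty qualifier). Works for str and bytes.
    """
    fixed = {}
    for column, value in data.items():
        if isinstance(column, bytes):
            if b":" not in column:
                column = column + b":"
        elif ":" not in column:
            column = column + ":"
        fixed[column] = value
    return fixed


print(with_qualifiers({"title": "dasds", "meta:author": "x"}))
# Intended use (needs a running Thrift server, so it is not executed here):
# table.put('001', with_qualifiers({'title': 'dasds'}))
```

Whether an empty qualifier is what you want depends on the table's schema; naming the qualifier explicitly (for example 'title:text', a made-up name) is usually easier to read back later.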

Heroku worker suddenly crashes, logs don't show any of my scripts. What happened?

I have a Flask, gunicorn, PostgreSQL project hosted on Heroku, and it suddenly failed. I can access the logs, but none of the scripts I wrote appear in them, so I am confused. I haven't changed anything between "working" and "not working", so I don't know where to start.
The log can be found in this pastebin.
The last part is:
2020-02-06T21:09:02.748093+00:00 app[web.1]: from werkzeug.contrib.cache import FileSystemCache
2020-02-06T21:09:02.748100+00:00 app[web.1]: ModuleNotFoundError: No module named 'werkzeug.contrib'
2020-02-06T21:09:02.748789+00:00 app[web.1]: [2020-02-06 21:09:02 +0000] [10] [INFO] Worker exiting (pid: 10)
I tried adding werkzeug to the requirements.txt, but that did not help. That would have been strange anyway, because everything was working fine without that change to the requirements.
If you could help me reduce the requirements.txt, it would be greatly appreciated.
Original requirements.txt:
cs50
Flask
Flask-Session
requests
gunicorn
psycopg2-binary
openpyxl
New, working ones:
astroid==2.3.3
attrs==19.3.0
Authlib==0.13
autopep8==1.5
awscli==1.17.9
backports.shutil-get-terminal-size==1.0.0
backports.shutil-which==3.5.2
beautifulsoup4==4.8.2
botocore==1.14.9
bs4==0.0.1
cairocffi==1.1.0
CairoSVG==2.4.2
certifi==2019.11.28
cffi==1.13.2
chardet==3.0.4
check50==3.0.10
Click==7.0
colorama==0.4.1
compare50==1.1.2
cryptography==2.8
cs50==5.0.3
cssselect2==0.2.2
cycler==0.10.0
defusedxml==0.6.0
docutils==0.15.2
EditorConfig==0.12.2
et-xmlfile==1.0.1
Flask==1.1.1
Flask-Session==0.3.1
help50==3.0.0
html5lib==1.0.1
icdiff==1.9.1
idna==2.8
ikp3db==1.4.1
intervaltree==2.1.0
isort==4.3.21
itsdangerous==1.1.0
jdcal==1.4.1
jellyfish==0.7.2
Jinja2==2.11.1
jmespath==0.9.4
jsbeautifier==1.10.3
kiwisolver==1.1.0
lazy-object-proxy==1.4.3
lib50==2.0.7
logger==1.4
MarkupSafe==1.1.1
matplotlib==3.1.3
mccabe==0.6.1
natsort==7.0.1
nltk==3.4.5
numpy==1.18.1
oauthlib==3.1.0
openpyxl==3.0.3
pandas==1.0.0
pexpect==4.8.0
Pillow==7.0.0
plotly==4.5.0
psycopg2-binary==2.8.4
ptyprocess==0.6.0
pyasn1==0.4.8
pycodestyle==2.5.0
pycparser==2.19
Pygments==2.5.2
pylint==2.4.4
pylint-django==2.0.13
pylint-flask==0.6
pylint-plugin-utils==0.6
pyparsing==2.4.6
PyPDF2==1.26.0
Pyphen==0.9.5
python-dateutil==2.8.1
python-magic==0.4.15
pytz==2019.3
PyYAML==5.2
render50==3.1.3
requests==2.22.0
requests-oauthlib==1.3.0
retrying==1.3.3
rsa==3.4.2
s3cmd==2.0.2
s3transfer==0.3.2
six==1.14.0
sortedcontainers==2.1.0
soupsieve==1.9.5
SQLAlchemy==1.3.13
sqlparse==0.3.0
style50==2.7.4
submit50==3.0.2
termcolor==1.1.0
tinycss2==1.0.2
tqdm==4.42.1
twython==3.7.0
typed-ast==1.4.1
urllib3==1.25.8
virtualenv==16.7.9
WeasyPrint==49
webencodings==0.5.1
Werkzeug==0.16.1
wrapt==1.11.2
gunicorn
Werkzeug released a new version yesterday (see its release history).
Apparently werkzeug.contrib has been moved to a separate module.
Try pinning the previous version:
./env/bin/pip install werkzeug==0.16.0
Here's another solution that might work for you.
Werkzeug 1.0.0 removed the deprecated code in werkzeug.contrib.
I ran into the same issue when trying to use ProxyFix with werkzeug==1.0.0.
After downgrading to werkzeug==0.16.0, I got these warnings:
DeprecationWarning: 'werkzeug.contrib.fixers.ProxyFix' has moved to 'werkzeug.middleware.proxy_fix.ProxyFix'.
This import is deprecated as of version 0.15 and will be removed in 1.0.
DeprecationWarning: 'werkzeug.contrib.cache' is deprecated as of version 0.15 and will be removed in version 1.0. It has moved to pallets.
from werkzeug.contrib.cache import FileSystemCache
To fix the issues:
pip install werkzeug==1.0.0
For ProxyFix:
from werkzeug.middleware.proxy_fix import ProxyFix
For FileSystemCache, you will have to install cachelib, the separate Pallets package the cache code moved to:
pip install -U cachelib
from cachelib.file import FileSystemCache
I hope this helps and fixes your issues 😄
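If the same code has to run against both old and new Werkzeug versions, one option is to try the new import location first and fall back to the old one. Below is a minimal sketch of that pattern. The helper name import_attr and its "module:attribute" string convention are made up for this example; the Werkzeug module paths in the usage comment are the ones quoted in the warnings above.

```python
import importlib


def import_attr(*candidates):
    """Return the first importable attribute from 'module:attr' candidates."""
    last_error = None
    for candidate in candidates:
        module_name, _, attr = candidate.partition(":")
        try:
            module = importlib.import_module(module_name)
            return getattr(module, attr)
        except (ImportError, AttributeError) as exc:
            # Remember the failure and try the next location.
            last_error = exc
    raise ImportError(f"none of {candidates!r} could be imported") from last_error


# Demonstration with the standard library only: the first path does not
# exist, so the second one is used.
OrderedDict = import_attr("no_such_pkg_xyz:Thing", "collections:OrderedDict")
print(OrderedDict.__name__)

# Intended use (needs Werkzeug installed, so it is left commented out):
# ProxyFix = import_attr("werkzeug.middleware.proxy_fix:ProxyFix",
#                        "werkzeug.contrib.fixers:ProxyFix")
```

Pinning an exact Werkzeug version in requirements.txt remains the simpler fix when only one version needs to be supported.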

Define setup.py dependencies from a private PyPI

I'd like to install dependencies from my private PyPI by specifying them within a setup.py.
I've already tried to specify where to find dependencies within the dependency_links this way:
setup(
    ...
    install_requires=["foo==1.0"],
    dependency_links=["https://my.private.pypi/"],
    ...
)
I've also tried to define the entire URL within the dependency_links:
setup(
    ...
    install_requires=[],
    dependency_links=["https://my.private.pypi/foo/foo-1.0.tar.gz"],
    ...
)
but neither of them worked when I tried to install with python setup.py install.
Can anybody help me?
EDITS:
With the first piece of code I got this error:
...
Installed .../test-1.0.0-py3.7.egg
Processing dependencies for test==1.0.0
Searching for foo==1.0
Reading https://my.private.pypi/
Reading https://pypi.org/simple/foo/
Couldn't find index page for 'foo' (maybe misspelled?)
Scanning index of all packages (this may take a while)
Reading https://pypi.org/simple/
No local packages or working download links found for foo==1.0
error: Could not find suitable distribution for Requirement.parse('foo==1.0')
while in the second case I didn't get any error, just the following:
...
Installed .../test-1.0.0-py3.7.egg
Processing dependencies for test==1.0.0
Finished processing dependencies for test==1.0.0
UPDATE 1:
I've tried to change the setup.py following sinoroc's instructions. Now my setup.py looks like this:
setup(
    ...
    install_requires=["foo==1.0"],
    dependency_links=["https://username:password@my.private.pypi/folder/foo/foo-1.0.tar.gz"],
    ...
)
I built the library test with python setup.py sdist and tried to install it with pip install /tmp/test/dist/test-1.0.0.tar.gz, but I still get this error:
Processing /tmp/test/dist/test-1.0.0.tar.gz
ERROR: Could not find a version that satisfies the requirement foo==1.0 (from test==1.0.0) (from versions: none)
ERROR: No matching distribution found for foo==1.0 (from test==1.0.0)
Regarding the private PyPI, I don't have any additional information because I'm not its administrator. As you can see, I just have the credentials (username and password) for that server.
Additionally, that PyPI is organised in sub-folders; the dependency I want to install is under https://my.private.pypi/folder/..
UPDATE 2:
When I run pip install --verbose /tmp/test/dist/test-1.0.0.tar.gz, it seems there is only one location searched for the library foo: the public server https://pypi.org/simple/foo/, and not our private server https://my.private.pypi/folder/foo/.
Here is the output:
...
1 location(s) to search for versions of foo:
* https://pypi.org/simple/foo/
Getting page https://pypi.org/simple/foo/
Found index url https://pypi.org/simple
Looking up "https://pypi.org/simple/foo/" in the cache
Request header has "max_age" as 0, cache bypassed
Starting new HTTPS connection (1): pypi.org:443
https://pypi.org:443 "GET /simple/foo/ HTTP/1.1" 404 13
Status code 404 not in (200, 203, 300, 301)
Could not fetch URL https://pypi.org/simple/foo/: 404 Client Error: Not Found for url: https://pypi.org/simple/foo/ - skipping
Given no hashes to check 0 links for project 'foo': discarding no candidates
ERROR: Could not find a version that satisfies the requirement foo==1.0 (from test==1.0.0) (from versions: none)
Cleaning up...
Removing source in /private/var/...
Removed build tracker '/private/var/...'
ERROR: No matching distribution found for foo==1.0 (from test==1.0.0)
Exception information:
Traceback (most recent call last):
...
In your second attempt, I believe you should still have foo==1.0 in the install_requires.
Update
Be aware that pip does not support dependency_links (it used to, but does not anymore).
For pip, the alternative is to use command line options such as --index-url, --extra-index-url, or --find-links. Unlike the dependency links from setuptools, these options cannot be enforced on the users of your project, so they have to be properly documented. To make this easier, it is a good idea to provide the users of your project with an example requirements.txt file. This file can contain some pip options.
For example:
# requirements.txt
# ...
--find-links 'https://my.private.pypi/'
foo==1.0
# ...
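Since the private server in the question requires a username and password, note that pip accepts basic-auth credentials embedded in an index URL, and that special characters in them need to be percent-encoded. The sketch below builds such a URL. The function name with_credentials and the sample credentials are made up, and whether the server should be passed to --extra-index-url or to --find-links depends on how it is laid out, which the question does not say.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def with_credentials(index_url, username, password):
    """Embed percent-encoded basic-auth credentials into index_url."""
    parts = urlsplit(index_url)
    # safe='' makes quote() encode '@', ':' and '/' as well.
    userinfo = f"{quote(username, safe='')}:{quote(password, safe='')}"
    netloc = f"{userinfo}@{parts.hostname}"
    if parts.port:
        netloc += f":{parts.port}"
    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )


# Sample values only; the host is the placeholder used in the question.
url = with_credentials("https://my.private.pypi/folder/", "user", "p@ss:word")
print(f"--extra-index-url {url}")
```

Keep in mind that a URL like this stores the password in plain text, so a requirements.txt containing it should not be committed to a shared repository.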

gotext: extract failed: pipeline: golang.org/x/text/message is not imported

I am trying to run the following command from within my template.go file:
//go:generate gotext -srclang=en update -out=catalog.go -lang=en,de_DE,es_MX,fr_CA,pt_BR
I expect a catalog.go to be generated, but instead I get the following error:
gotext: extract failed: pipeline: golang.org/x/text/message is not imported
template.go:3: running "gotext": exit status 1
I do have the following import in the template.go after the generate command:
import (
    "time"
    log "github.com/sirupsen/logrus"
    "golang.org/x/text/message"
)
I've tried to move the import before the generate command. I've also tried to run generate ./... from within the root of the project. I've also tried to run gotext by itself, but it's the same error message.
I also found the following thread on github:
https://github.com/golang/go/issues/26312
I've tried what was suggested there, but it didn't seem to have solved the issue either.
I solved the issue by running rm -rf vendor/golang.org/x/text from the root of the project. Of course, for things to work I also needed to have gotext installed, which can be done by running go get golang.org/x/text/cmd/gotext.
I believe the issue could also be solved if the binaries of .../text/message were installed in GOPATH/bin as well.

Pig UDF running on AWS EMR with java.lang.NoClassDefFoundError: org/apache/pig/LoadFunc

I am developing an application that reads log files stored in S3 buckets and parses them using Elastic MapReduce. Currently the log file has the following format:
-------------------------------
COLOR=Black
Date=1349719200
PID=23898
Program=Java
EOE
-------------------------------
COLOR=White
Date=1349719234
PID=23828
Program=Python
EOE
So I tried to load the file in my Pig script, but the built-in Pig loaders don't seem to be able to load my data, so I have to create my own UDF. Since I am pretty new to Pig and Hadoop, I want to try a script written by others before I write my own, just to get a taste of how UDFs work. I found one at http://pig.apache.org/docs/r0.10.0/udf.html, a SimpleTextLoader. In order to compile this SimpleTextLoader, I had to add a few imports:
import java.io.IOException;
import java.util.ArrayList;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.InputFormat;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigSplit;
import org.apache.pig.backend.executionengine.ExecException;
import org.apache.pig.data.Tuple;
import org.apache.pig.data.TupleFactory;
import org.apache.pig.data.DataByteArray;
import org.apache.pig.PigException;
import org.apache.pig.LoadFunc;
Then I found out that I need to compile this file, so I had to install svn, then check out and build Pig by running:
sudo apt-get install subversion
svn co http://svn.apache.org/repos/asf/pig/trunk
ant
Now I have a pig.jar file, so I try to compile the loader:
javac -cp ./trunk/pig.jar SimpleTextLoader.java
jar -cf SimpleTextLoader.jar SimpleTextLoader.class
It compiles successfully. I then start Pig to enter grunt, and in grunt I try to load the file using:
grunt> register file:/home/hadoop/myudfs.jar
grunt> raw = LOAD 's3://mys3bucket/samplelogs/applog.log' USING myudfs.SimpleTextLoader('=') AS (key:chararray, value:chararray);
2012-12-05 00:08:26,737 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org/apache/pig/LoadFunc Details at logfile: /home/hadoop/pig_1354666051892.log
The pig_1354666051892.log file contains:
Pig Stack Trace
---------------
ERROR 2998: Unhandled internal error. org/apache/pig/LoadFunc
java.lang.NoClassDefFoundError: org/apache/pig/LoadFunc
I also tried another UDF (UPPER.java) from http://wiki.apache.org/pig/UDFManual, and I still get the same error when I try to use the UPPER method. Can you please help me figure out what the problem is? Many thanks!
UPDATE: I also tried the pig.jar built into EMR at /home/hadoop/lib/pig/pig.jar and got the same problem.
Put the UDF jar in the /home/hadoop/lib/pig directory or copy the pig-*-amzn.jar file to /home/hadoop/lib and it will work.
You would probably use a bootstrap action to do either of these.
Most of the Hadoop ecosystem tools like pig and hive look up $HADOOP_HOME/conf/hadoop-env.sh for environment variables.
I was able to resolve this issue by adding pig-0.13.0-h1.jar (it contains all the classes required by the UDF) to the HADOOP_CLASSPATH:
export HADOOP_CLASSPATH=/home/hadoop/pig-0.13.0/pig-0.13.0-h1.jar:$HADOOP_CLASSPATH
pig-0.13.0-h1.jar is available in the Pig home directory.
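A NoClassDefFoundError for org/apache/pig/LoadFunc means the JVM could not find that class on its classpath at run time, so before changing HADOOP_CLASSPATH it can help to confirm which jar actually contains it. Here is a minimal sketch of such a check, relying only on the fact that a jar is a zip archive. The function names are made up, and the demo builds a throwaway jar so that the example runs anywhere.

```python
import os
import tempfile
import zipfile


def class_entry(class_name):
    """Map a dotted class name to its path inside a jar."""
    return class_name.replace(".", "/") + ".class"


def jars_providing(class_name, jar_paths):
    """Return the jars from jar_paths that contain class_name."""
    entry = class_entry(class_name)
    found = []
    for path in jar_paths:
        try:
            with zipfile.ZipFile(path) as jar:
                if entry in jar.namelist():
                    found.append(path)
        except (OSError, zipfile.BadZipFile):
            # Missing or unreadable jars are skipped.
            continue
    return found


# Demo with a throwaway jar holding an empty placeholder class file.
with tempfile.TemporaryDirectory() as tmp:
    demo_jar = os.path.join(tmp, "demo.jar")
    with zipfile.ZipFile(demo_jar, "w") as jar:
        jar.writestr("org/apache/pig/LoadFunc.class", b"")
    candidates = [demo_jar, os.path.join(tmp, "missing.jar")]
    demo_result = jars_providing("org.apache.pig.LoadFunc", candidates)

print([os.path.basename(p) for p in demo_result])
```

On a real cluster the list of candidates would be the jars under the Pig and Hadoop lib directories mentioned in the answers above.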
