How can I migrate my MongoDB to RethinkDB? - rethinkdb

How can I migrate my MongoDB collections to RethinkDB tables?
I'm not concerned about changing my old Mongo _ids to RethinkDB ids, because they will be ignored in my implementation, and I'm not concerned about them cluttering my data.

I wrote a quick Bash script to solve this. Because I only had the JavaScript RethinkDB driver, I had to install the Python driver first so I could use rethinkdb import.
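If you don't already have the Python driver, installing it with pip should be enough to get the rethinkdb import command (a minimal sketch, assuming pip is available on the machine running the import):
# Install the RethinkDB Python driver, which ships the import/export tools
pip install rethinkdb
# Sanity check: the import subcommand should now be available
rethinkdb import --help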
For this example, I am migrating the Mongo collections users, pinboards, and analytics; add your own collections as needed to the following Bash command:
for collection in users pinboards analytics; \
do \
  mongoexport \
    --host my.mongo.server \
    --db my_mongo_database \
    --collection $collection \
    > $collection.json; \
  rethinkdb import \
    --file $collection.json \
    --table my_rethink_database.$collection; \
  rm $collection.json; \
done
Don't forget to change the names of your collections, host, and database. Also, adjust arguments to mongoexport and rethinkdb import as needed.
This is quick and dirty, but I hope it gets someone started in the right direction!
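If you need to tweak the individual steps, a variant for a single collection might look like this (the field list and the --force/--clients options are illustrative; check the flags supported by your mongoexport and rethinkdb versions):
mongoexport \
  --host my.mongo.server \
  --db my_mongo_database \
  --collection users \
  --fields name,email \
  > users.json
rethinkdb import \
  --file users.json \
  --table my_rethink_database.users \
  --force \
  --clients 4
Here --force is meant to let the import write into a table that already exists, and --clients to control how many concurrent connections the importer uses.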

Related

Role not created after using psql -c "create user...."

I'm trying to create a script that creates read-only users, but when I try to create a user my database doesn't show the new user. The terminal returns "CREATE ROLE" and no errors.
I've tried setting -U, -h and -p, and calling current_user(), which shows that I'm the superuser making the call.
psql \
-X \
-U $PGUSERNAME \
-h $DBHOST \
-p $PORT \
-c "create user readonly with password 'welcome';" \
-d $DATABASENAME \
I'm connected to my Postgres via GoLand, and when I run the shell file that executes this psql command, the terminal returns "CREATE ROLE", but Postgres doesn't show any new roles; when I run a query in the PostgreSQL console to find my readonly user, 0 rows are found. I expected the shell script to create a user and update my local database.
As Laurenz Albe mentioned in the comments, removing the trailing \ at the end resolved the issue.
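For reference, a sketch of the corrected invocation (same placeholders as above, with no continuation backslash after the final argument):
psql \
  -X \
  -U $PGUSERNAME \
  -h $DBHOST \
  -p $PORT \
  -c "create user readonly with password 'welcome';" \
  -d $DATABASENAME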

Mongoexport with Query using shell script

I am calling mongoexport from a shell script but it keeps failing. My script is:
mongo localhost:27038/admin -u test -p mongo123 < mongoexport.js
and my mongoexport.js file is:
use db1
mongoexport --type=csv --fields _id -q '{"_id": "323549991" , "cat" : "ABC"}' --out report.csv
But every time I run it, it fails with the error below:
switched to db db1
2018-01-10T17:36:15.495+0000 E QUERY [thread1] SyntaxError: missing ; before statement #(shell):1:14
bye
Now I am not sure where exactly I am messing up the syntax.
Regards.
It looks like you are connecting to the mongo shell first. You don't need to do that in order to run mongoexport: it is a standalone command-line tool, so you run it directly from your system shell and point it at your host, not from inside mongo. Take a look at the official documentation:
This data resides on the MongoDB instance located on the host
mongodb1.example.net running on port 37017, which requires the
username user and the password pass.
mongoexport --host mongodb1.example.net --port 37017 --username user --password "pass" --collection contacts --db marketing --out mdb1-examplenet.json
In your case it should look like this (untested):
mongoexport --host localhost --port 27038 --username test --password "mongo123" --db admin --collection db1 --type=csv --fields _id -q '{"_id": "323549991" , "cat" : "ABC"}' --out report.csv
I assumed your database is called admin and your collection db1, if not replace them accordingly.
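In other words, you can drop mongoexport.js entirely and put the command in a plain shell script instead (the file name export.sh is just an example):
#!/bin/bash
# export.sh -- runs mongoexport directly; no mongo shell session is needed
mongoexport --host localhost --port 27038 --username test --password "mongo123" \
  --db admin --collection db1 --type=csv --fields _id \
  -q '{"_id": "323549991", "cat": "ABC"}' --out report.csv
Run it with bash export.sh (or make it executable with chmod +x export.sh).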

mongodump to another host

I have the following script:
mongodump --gzip -d foobar \
--excludeCollection=foo1 \
--excludeCollection=foo2 \
--excludeCollection=foo3 \
--excludeCollection=foo4 -o ./
But the dump is too large for the server it's on; it's literally taking up all the disk space. Is there any way to make it dump to another host? Maybe using scp?
The easiest thing to do is to run mongodump from another computer, if the database is accessible from there, using the --host parameter and any credentials you may need. It's quite similar to using the mongo shell to connect to a remote instance.
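For example, from a machine with enough free disk space, something along these lines should work (host, port, and credentials are placeholders; untested):
# Run this on the machine that has the free disk space
mongodump --host my.mongo.server --port 27017 \
  --username myuser --password "mypass" --authenticationDatabase admin \
  --gzip -d foobar \
  --excludeCollection=foo1 \
  --excludeCollection=foo2 \
  --excludeCollection=foo3 \
  --excludeCollection=foo4 -o ./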

How to write a Sqoop Job using shell script

My use case:
Hourly tables get created in a MySQL db, day by day. I need to move them to HDFS every day using Sqoop and process the HDFS data using Impala.
How do I write a shell script or job that periodically moves only the newly created tables' data to HDFS (the existing file system)?
Say today is the 3rd of Jan 2016; when I run my job today, the data for the 2nd of Jan 2016 should be moved from MySQL to HDFS. Likewise, every day it should move the previous day's data.
Daily I need to run my Impala queries on this HDFS cluster and generate a report.
How to process this whole data using Impala and generate a report?
Sqoop supports incremental import: http://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_incremental_imports
Sqoop-import can use either a last-modified timestamp, or an always-increasing row ID to decide which rows to import. You need to provide a --last-value parameter. You could store the last value in between jobs, or maybe retrieve it from your Impala database before running the job.
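A minimal sketch of such an incremental import, assuming the source table has a modification-timestamp column (the connection string, table, and column names are placeholders):
sqoop import \
  --connect jdbc:mysql://mysqlhost/mydb \
  --username myuser \
  --password-file /user/me/mysql.password \
  --table hourly_events \
  --incremental lastmodified \
  --check-column updated_at \
  --last-value "2016-01-02 00:00:00" \
  --target-dir /data/hourly_events \
  -m 1
Sqoop prints the value to use as the next --last-value at the end of an incremental run; store it somewhere (a file, or a small Impala/Hive table) and feed it back in on the next run.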
The best way would be a shell script that takes 2 arguments: one is the name of the table and the other is the HDFS path, because those are the only 2 factors that change in your scenario. Below is a sample script that you can put into a .sh file and run with bash.
#!/bin/bash
TABLENAME=$1
HDFSPATH=$2
NOW=$(date +"%m-%d-%Y-%H-%M-%S")
sqoop import --connect jdbc:db2://mystsrem:60000/SCHEMA \
--username username \
--password-file password \
--query "select * from ${TABLENAME} where \$CONDITIONS" \
-m 1 \
--delete-target-dir \
--target-dir ${HDFSPATH} \
--fetch-size 30000 \
--class-name ${TABLENAME} \
--fields-terminated-by '\01' \
--lines-terminated-by '\n' \
--escaped-by '\' \
--verbose &> logonly/${TABLENAME}_import_${NOW}.log
Optional: if you need to import into a Hive table, also add:
--hive-drop-import-delims \
--hive-import \
--hive-overwrite \
--hive-table HiveSchema.${TABLENAME}
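Usage would then be something like this (the script name and paths are just examples):
# Assuming the script above was saved as sqoop_import.sh
chmod +x sqoop_import.sh
mkdir -p logonly
./sqoop_import.sh hourly_table_20160102 /data/hourly_table_20160102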

How to export 4 million data in MySQL?

I have a database with one particular table having more than 4 million records. I tried downloading the whole db using MySQL Workbench as well as the command line, using the following command:
mysqldump -u root -p password mydb > myfile.sql
But I got only half of the data. If I skip that one particular table, it works fine. Can anyone suggest how to download a db with tables that have more than a million entries?
Try adding the lines below to my.cnf and restarting:
[mysqld]
# Performance settings used for import.
delay_key_write=ALL
bulk_insert_buffer_size=256M
or
mysqldump -u root -p --max_allowed_packet=1073741824 --lock-tables=false mydb > myfile.sql
