I was using the Hive CLI to directly run INSERT OVERWRITE LOCAL DIRECTORY 'local/machine/folder/location' SELECT * FROM table.
The Hive CLI would write the output file to that location on the client machine.
Now I'm moving to beeline. The same command invoked through beeline writes the file to the HiveServer2 machine instead.
beeline -u ${hive_resource_jdbcurl} ${hive_resource_username} ${hive_resource_password} ${hive_resource_driverclass} -S -e "${insert_overwrite_command}"
I want to know if there is any way to get the file onto the client machine instead of the HiveServer2 machine.
E.g.,
HiveServer2 machine - HS2_Machine
AppServer/WebServer machine - App1_Machine
The beeline command (INSERT OVERWRITE LOCAL DIRECTORY) is triggered from App1_Machine, but it writes the output to a local directory on HS2_Machine. I want to know if there is a way/command to get the file onto App1_Machine's local disk instead.
PS: I don't want to scp/ftp the file from HS2_Machine to the app server, because I'm dealing with a huge volume of data and don't want two operations (storing it on HS2_Machine and then moving that huge file to App1_Machine).
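One possible approach (a sketch only, reusing the placeholder variables above plus a hypothetical output path) is to skip INSERT OVERWRITE LOCAL DIRECTORY and let beeline itself format the result set; since beeline runs on App1_Machine, redirecting its stdout lands the file on the client. Whether --outputformat=csv2 is available depends on your Hive version:
beeline -u "${hive_resource_jdbcurl}" -n "${hive_resource_username}" -p "${hive_resource_password}" \
    --silent=true --showHeader=false --outputformat=csv2 \
    -e "SELECT * FROM table" > /app1/local/folder/output.csv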
My environment contains clusters with multiple hosts in each, and I tend to run similar or equivalent commands on the hosts within a cluster.
Sometimes I am ssh-ed into a cluster host and remember that I ran a certain command on another host in this cluster, but I can't remember which host I ran it on, and I need to run that command again.
Since every host in the cluster has its own .bash_history, I have to log in to each and every one of them and look through the .bash_history file to locate that command.
However, if I could use one .bash_history file for all hosts in the cluster (e.g. named .bash_history.clusterX) then I would be able to search the command in the bash history (with CTRL+R) and execute it.
Is that possible?
In my setup, a shared home directory (via NFS, etc.) is not an option.
Another approach is to leave the relevant commands to execute in an executable file ('ssh_commands') in the home folder of each remote user on each machine.
Those ssh_commands will include the commands you need to execute on each server whenever you open an SSH session.
To call that file on each SSH session:
ssh remoteUser@remoteServer -t "/bin/bash --init-file <(echo 'source ssh_commands')"
That way, you don't have to look for the right commands to execute, locally or remotely: your SSH session opens and immediately executes what you want.
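For instance, ssh_commands could point bash at a cluster-wide history file name so that CTRL+R searches the same file on every host (the file name and contents below are purely illustrative; without a shared filesystem the file itself still has to be copied between hosts somehow):
# ~/ssh_commands (illustrative)
export HISTFILE=~/.bash_history.clusterX   # use one history file name on every host
shopt -s histappend                        # append to it instead of overwriting on exit
source ~/.bashrc                           # keep the usual interactive setup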
I am running bash files to make a Mongo dump on a daily basis. In my local directory I run one bash file that connects to the server terminal, and in the server terminal I run another file that makes the Mongo dump.
Is it possible to make one file that connects to the MongoDB server terminal and runs the commands on the server?
I have tried many commands, but it was not possible to run the commands on the server terminal from one bash file: once the server terminal opens, the remaining commands do not execute.
Is it possible to have one bash file that executes the commands on the server?
Connect to your DB remotely using this command:
mongo --username username --password secretstuff --host YOURSERVERIP --port 28015
You can then automate this by including the relevant commands (including the above) in a bash script that you can run from anywhere.
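A minimal sketch of such a script, assuming the same placeholder host and credentials as above and that the MongoDB tools are installed locally (mongodump here, since the goal is a dump rather than an interactive shell):
#!/bin/bash
# daily_dump.sh - run from the local machine; writes the dump to a local directory
mongodump --host YOURSERVERIP --port 28015 \
    --username username --password secretstuff \
    --out "/backups/mongo/$(date +%F)"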
To solve the above problem, the answer from Matias Barrios seems correct to me. You don't use a script on the server; instead, use tools on your local machine that connect to the server's services and manage them.
Nevertheless, to execute a script on a remote server, you could use ssh. This is not the right solution in your case, but it answers the question in your title.
ssh myuser@MongoServer ./script.sh param1
This can be used in a local script to execute script.sh on the server MongoServer (with param1) with the system privileges of the user myuser.
Beforehand, don't forget to avoid the password prompt with
ssh-copy-id myuser@MongoServer
This will copy your SSH public key into myuser's home directory on MongoServer.
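If the dump script exists only on your local machine, a common variant (a sketch with hypothetical file names) is to stream it to the remote shell instead of copying it to the server first:
ssh myuser@MongoServer 'bash -s' -- param1 < ./local_dump.sh   # remote bash reads the local script from stdin, with param1 as $1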
What is the best approach to move files from a Linux box to HDFS: should I use Flume or SSH?
SSH Command:
cat kali.txt | ssh user@hadoopdatanode.com "hdfs dfs -put - /data/kali.txt"
The only problem with SSH is that I need to enter the password every time; I need to check how to authenticate without typing the password.
Can Flume move files straight to HDFS from one server?
Maybe you can set up passwordless SSH and then transfer files without entering a password.
Maybe you could create a script, in Python for example, which does the job for you.
You could install a Hadoop client on the Linux box that has the files. Then you could "hdfs dfs -put" your data directly from that box to the Hadoop cluster.
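A sketch of both suggestions (host name and paths are placeholders): set up key-based SSH once, or skip SSH entirely by installing a Hadoop client on the source box:
# one-time key setup so the cat | ssh pipe above no longer prompts for a password
ssh-keygen -t rsa -b 4096
ssh-copy-id user@hadoopdatanode.com
# alternatively, with a Hadoop client configured against the cluster on the source box:
hdfs dfs -put /local/path/kali.txt /data/kali.txt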
If I run Pig on Hadoop in local mode (because I do not want to use HDFS), it processes my scripts in single-thread/single-process mode. If I set up Hadoop in pseudo-distributed mode (HDFS with replication=1), Pig does not like my file:///... paths:
traj = LOAD 'file:///root/traj'
    USING org.apache.pig.piggybank.storage.CSVExcelStorage(
        ';', 'NO_MULTILINE', 'UNIX', 'SKIP_INPUT_HEADER'
    ) AS (
        a1:chararray,
        a2:long,
        a3:long,
        a4:float,
        a5:float,
        a6:float,
        a7:chararray,
        a8:float,
        a9:chararray
    );
c = FOREACH (GROUP traj ALL) GENERATE COUNT(traj);
dump c;
Is there any way to tell Pig to process the files in multi-core mode without putting them into HDFS?
Local Mode - To run Pig in local mode, you need access to a single machine; all files are installed and run using your local host and file system. Specify local mode using the -x flag (pig -x local).
Mapreduce Mode - To run Pig in mapreduce mode, you need access to a Hadoop cluster and HDFS installation. Mapreduce mode is the default mode; you can, but don't need to, specify it using the -x flag (pig OR pig -x mapreduce).
Source: http://pig.apache.org/docs/r0.9.1/start.html#execution-modes
If you want to run it in local mode, you should switch Pig into local mode using the command $ pig -x local. By default, Pig runs in MapReduce mode and reads data from HDFS.
To run Pig in local mode, you only need access to a single machine. To keep things simple, copy your files to your current working directory (you may want to create a temp directory and move into it) and provide that location in your Pig script.
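For example, assuming the script above is saved as traj_count.pig and its LOAD path points at the copied data (file names are illustrative):
cp -r /root/traj .            # data now sits next to the script
pig -x local traj_count.pig   # runs entirely on the local filesystem, no HDFS needed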
So I have a batch server that runs a batch script. This script issues a mysqldump command for our db server.
mysqldump -h nnn.nn.nnn.nn -u username -p password --tab=/var/batchfiles/ --fields-enclosed-by='"' --fields-terminated-by="," --fields-escaped-by="\\" --lines-terminated-by="\\n" store_locations stores
When the command runs, I get an error:
Can't create/write to file '/var/mi6/batch/stores.txt' (Errcode: 2) when executing 'SELECT INTO OUTFILE'
I have also tried outputting to the /tmp dir, as suggested at http://techtots.blogspot.com/2011/12/using-mysqldump-to-export-csv-file.html, and it is still unable to write the file: it tells me the file already exists, even though it doesn't.
The bottom line is, I would like to be able to run a script on server A that issues a mysql command against the DB server and have the output file saved to server A in CSV format.
FYI, I have also tried just running mysql and redirecting the output to a file. This creates a tab-delimited file, but you don't have much control over the output, so it won't really work either.
mysqldump in --tab mode is a CLI front end for SELECT ... INTO OUTFILE, and the latter is meant to create a delimited file afresh, and only on the DB server host.
SELECT ... INTO Syntax
The SELECT ... INTO OUTFILE statement is intended primarily to let you
very quickly dump a table to a text file on the server machine. If you
want to create the resulting file on some other host than the server
host, you normally cannot use SELECT ... INTO OUTFILE since there is
no way to write a path to the file relative to the server host's file
system.
You have at least the following options:
use mysql instead of mysqldump on the remote host to create a tab-delimited file
mysql -h<host> -u<user> -p<password> \
    -e "SELECT 'column_name', 'column_name2'... \
        UNION ALL SELECT column1, column2 FROM stores" > \
    /path/to/your/file/file_name
you can pipe it through sed or awk to turn the tab-delimited output into a CSV file. See this for details; a minimal sketch is also shown after this list
you can make a location on the remote host accessible through a network-mapped path on the DB server's file system.
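As a sketch of the sed route (host, credentials, column names, and output path are placeholders; \t in sed is a GNU sed extension), run on server A:
mysql -h<host> -u<user> -p<password> \
    -e "SELECT column1, column2 FROM store_locations.stores" \
  | sed 's/"/""/g; s/\t/","/g; s/^/"/; s/$/"/' \
  > /path/on/server_A/stores.csv
The first sed expression doubles any quotes already in the data, the second turns tabs into "," separators, and the last two wrap each line (including the header row) in quotes.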