Unable to alter partition location in hive - hadoop

I am trying to change the partition location of my external hive table.
Command that I try to run:
ALTER TALBE sl_uploads PARTITION (hivetimestamp='2016-07-26 15:00:00') SET LOCATION '/data/dev/event/uploads/hivetimestamp=2016-07-26 15:00:00'
Error I get :
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.net.URISyntaxException: Illegal character in path
My data for a particular partition exists at the path:
/data/dev/event/uploads/hivetimestamp=date time/actual_data
I think the space is causing the issue, but any help on this would be great.

Is your HDFS path right?
Do you need to add /actual_data/ at the end?

Hive is unable to parse the full HDFS path because of the space in "2016-07-26 15:00:00".
You can work around it with a Hive variable, for example:
hive> set part=2016-07-26 15:00:00;
hive> ALTER TABLE sl_uploads PARTITION (hivetimestamp='2016-07-26 15:00:00') SET LOCATION '/data/dev/event/uploads/hivetimestamp=${hiveconf:part}';
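To confirm that the location actually changed, DESCRIBE FORMATTED with a partition spec prints the partition's Location field (a quick sanity check using the table and partition from the question):
hive> describe formatted sl_uploads partition (hivetimestamp='2016-07-26 15:00:00');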

The first thing I noticed is that you wrote TALBE instead of TABLE.

Related

MSCK repair table failing for schema tables

My Hive table name is in the following format:
schema_name.hive_table_name
e.g.: schema1.abc
Now when I try to run MSCK REPAIR TABLE on the above Hive table, it throws the error below.
Logging initialized using configuration in file:/etc/hive/conf.dist/hive-log4j.properties
FAILED: ParseException line 1:28 missing EOF at '.' near 'schema_name'
Below is the command I used:
hive -e "MSCK repair table schema_name.hive_table_name"
Could anyone help with this?
I tried the statement below:
hive -e "use schema_name;MSCK repair table hive_table_name"
This adds the partitions to the Hive table under the specified schema.
It worked for me.
Thanks
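For reference, the same fix from an interactive hive session (using the schema1.abc example from the question):
hive> use schema1;
hive> MSCK REPAIR TABLE abc;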

Hadoop backend with millions of records insertion

I am new to Hadoop. Can someone please suggest how to upload millions of records to Hadoop? Can I do this with Hive, and where can I see my Hadoop records?
Until now I have used Hive to create the database on Hadoop, and I am accessing it at localhost:50070. But I am unable to load data from a CSV file into Hadoop from the terminal, as it gives me this error:
FAILED: Error in semantic analysis: Line 2:0 Invalid path ''/user/local/hadoop/share/hadoop/hdfs'': No files matching path hdfs://localhost:54310/usr/local/hadoop/share/hadoop/hdfs
Can anyone suggest a way to resolve this?
I suppose the data is initially in the local file system.
So a simple workflow could be: load the data from local into the Hadoop file system (HDFS), create a Hive table over it, and then load the data into the Hive table.
Step 1:
// put in HDFS
$~ hadoop fs -put /local_path/file_pattern* /path/to/your/HDFS_directory
// check files
$~ hadoop fs -ls /path/to/your/HDFS_directory
Step 2:
CREATE EXTERNAL TABLE if not exists mytable (
Year int,
name string
)
row format delimited
fields terminated by ','
lines terminated by '\n'
stored as TEXTFILE;
// display table structure
describe mytable;
Step 3:
// no LOCAL keyword here: the data was already copied to HDFS in Step 1
LOAD DATA INPATH '/path/to/your/HDFS_directory'
OVERWRITE INTO TABLE mytable;
// simple hive statement to fetch top 10 records
SELECT * FROM mytable limit 10;
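Since the question also asks where the records can be seen: after the load, the table's files sit under the Hive warehouse directory and can be listed with hadoop fs (the path below is the common default and depends on hive.metastore.warehouse.dir):
$~ hadoop fs -ls /user/hive/warehouse/mytable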
You should use LOAD DATA LOCAL INPATH <local-file-path> to load files from a local directory into Hive tables.
If you don't specify LOCAL, the load command will look up the given file path in HDFS instead.
Please refer to the link below:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-Loadingfilesintotables
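To make the distinction concrete, here is a minimal sketch of both forms (file names and paths are placeholders):
// copies the file from the local filesystem into the table's location
LOAD DATA LOCAL INPATH '/local_path/file.csv' OVERWRITE INTO TABLE mytable;
// without LOCAL, the path is resolved in HDFS and the file is moved into the table's location
LOAD DATA INPATH '/path/to/your/HDFS_directory/file.csv' OVERWRITE INTO TABLE mytable;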

Hive INSERT OVERWRITE to Google Storage as LOCAL DIRECTORY not working

I use the following Hive Query:
hive> INSERT OVERWRITE LOCAL DIRECTORY "gs:// Google/Storage/Directory/Path/Name" row format delimited fields terminated by ','
select * from <HiveDatabaseName>.<HiveTableName>;
I am getting the following error:
"Error: Failed with exception Wrong FS:"gs:// Google/Storage/Directory/PathName", expected: file:///
What could I be doing wrong?
Remove LOCAL from your syntax.
See the syntax below:
INSERT OVERWRITE DIRECTORY 'gs://Your_Bucket_Path/'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LINES TERMINATED BY "\n"
SELECT * FROM YourExistingTable;
There's a bug in Hive, including IIRC Hive 1.2.1, where it uses the configured fs.default.name or fs.defaultFS for its scratchdir even if the table path is in a different filesystem. In your case, it appears you have the out-of-the-box defaults setting fs.defaultFS to file:///, which is why it says "expected: file:///". On a distributed Hadoop cluster, you might see it say "expected: hdfs://..." instead.
You can fix it within a single hive prompt by overriding fs.default.name and fs.defaultFS:
> set fs.default.name=gs://your-bucket/
> set fs.defaultFS=gs://your-bucket/
You may also want to modify those entries inside your core-site.xml file to point at your GCS location, so you don't have to override them every time.
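If you go the core-site.xml route, the entries would look roughly like this (the bucket name is a placeholder):
<property>
  <name>fs.default.name</name>
  <value>gs://your-bucket/</value>
</property>
<property>
  <name>fs.defaultFS</name>
  <value>gs://your-bucket/</value>
</property>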

create table in postgresql fails - tablespace issue

I am trying to create the following table in PostgreSQL:
CREATE TABLE retail_demo.categories_dim_hawq
(
category_id integer NOT NULL,
category_name character varying(400) NOT NULL
)
WITH (appendonly=true, compresstype=quicklz) DISTRIBUTED RANDOMLY;
I am getting the following error:
ERROR: cannot get table space location for content 0 table space 1663
(catalog.c:97)
I tried to create a new tablespace, I got the following:
ERROR: syntax error at or near "LOCATION" LINE 1: create TABLESPACE
moha LOCATION "/tmp/abc";
Thanks in advance,
Moha.
I got the answer.
You'll need to create a filespace, a tablespace, and a database, and then create the table. To do this, follow these steps:
1. If you are in the default database (using the psql command), exit back to the root DB user (gpadmin) with CTRL + D.
2. Run gpfilespace -o .
3. Enter the name of the filespace: hawqfilespace3
4. Choose the filesystem name for this filespace: hdfs
5. Enter the replica num for the filespace: 0
6. Specify the HDFS location for the segments: bigdata01.intrasoft.com.jo:8020/xd
Note that /xd is one of the Hadoop directories with read/write access.
7. The system will generate a configuration command for you; copy and paste it and press Enter to execute it.
8. The filespace is now created successfully.
9. Now connect to the database using the psql command.
10. Now create a tablespace on the filespace you created:
create TABLESPACE hawqtablespace3 FILESPACE hawqfilespace3;
11. Create a database on this tablespace:
CREATE DATABASE hawqdatabase3 WITH OWNER gpadmin TEMPLATE=template0 TABLESPACE hawqtablespace3;
12. Now you need to connect to the database you created, but first press CTRL + D to exit the current session.
13. Enter the command psql hawqdatabase3
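Once connected to hawqdatabase3, the CREATE TABLE from the question should go through. A minimal sketch, assuming the retail_demo schema still has to be created in the new database:
CREATE SCHEMA retail_demo;
CREATE TABLE retail_demo.categories_dim_hawq
(
category_id integer NOT NULL,
category_name character varying(400) NOT NULL
)
WITH (appendonly=true, compresstype=quicklz) DISTRIBUTED RANDOMLY;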

Hive error when creating an external table (state=08S01,code=1)

I'm trying to create an external table in Hive, but keep getting the following error:
create external table foobar (a STRING, b STRING) row format delimited fields terminated by "\t" stored as textfile location "/tmp/hive_test_1375711405.45852.txt";
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask (state=08S01,code=1)
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask (state=08S01,code=1)
Aborting command set because "force" is false and command failed: "create external table foobar (a STRING, b STRING) row format delimited fields terminated by "\t" stored as textfile location "/tmp/hive_test_1375711405.45852.txt";"
The contents of /tmp/hive_test_1375711405.45852.txt are:
abc\tdef
I'm connecting via the beeline command line interface, which uses Thrift HiveServer2.
System:
Hadoop 2.0.0-cdh4.3.0
Hive 0.10.0-cdh4.3.0
Beeline 0.10.0-cdh4.3.0
Client OS - Red Hat Enterprise Linux Server release 6.4 (Santiago)
The issue was that I was pointing the external table at a file in HDFS instead of a directory. The cryptic Hive error message really threw me off.
The solution is to create a directory and put the data file in there. To fix the above example, you'd create the directory /tmp/foobar and place hive_test_1375711405.45852.txt in it. Then create the table like so:
create external table foobar (a STRING, b STRING) row format delimited fields terminated by "\t" stored as textfile location "/tmp/foobar";
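In commands, that fix looks roughly like this (assuming the data file already sits in HDFS at the path from the question):
// create the directory and move the data file into it
$~ hadoop fs -mkdir /tmp/foobar
$~ hadoop fs -mv /tmp/hive_test_1375711405.45852.txt /tmp/foobar/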
We faced a similar problem in our company (a Sentry, Hive, and Kerberos combination). We solved it by removing all privileges granted on URIs that were not fully qualified. For example, we changed GRANT ALL ON URI '/user/test' TO ROLE test; to GRANT ALL ON URI 'hdfs-ha-name:///user/test' TO ROLE test;.
You can find the privileges for a specific URI in the Hive database (MySQL in our case).
