ProgrammingError: (1064, "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server - pymysql

I am trying to create a SQL table using Python and the PyMySQL package, but it keeps raising an error. My code is here:
import pymysql

conn = pymysql.connect(
    host='database-1.czswegfdjhpn.us-east-1.rds.amazonaws.com',
    port=3306,
    user='admin',
    password='',
    cursorclass=pymysql.cursors.DictCursor
)
cursor = conn.cursor()
print('connected successfully')
cursor.execute('create database covid_19')
cursor.execute('use covid_19')
# create table
sql_query = '''
CREATE TABLE covid_19_world_cases_deaths_testing (
iso_code varchar, continent varchar, location varchar, date varchar, total_cases float, \
new_cases float, new_cases_smoothed float, total_deaths float, new_deaths float,\
new_deaths_smoothed float, total_cases_per_million float, new_cases_per_million float,\
new_cases_smoothed_per_million float, total_deaths_per_million float, new_deaths_per_million float,\
new_deaths_smoothed_per_million float, reproduction_rate float, icu_patients float,\
icu_patients_per_million float, hosp_patients float, hosp_patients_per_million float,\
weekly_icu_admissions float, weekly_icu_admissions_per_million float, weekly_hosp_admissions float,\
weekly_hosp_admissions_per_million float, total_tests float, new_tests float, total_tests_per_thousand float,\
new_tests_per_thousand float, new_tests_smoothed float, new_tests_smoothed_per_thousand float, positive_rate float,\
tests_per_case float, tests_units varchar, total_vaccinations float, people_vaccinated float, people_fully_vaccinated float,\
total_boosters float, new_vaccinations float, new_vaccinations_smoothed float, total_vaccinations_per_hundred float,\
people_vaccinated_per_hundred float, people_fully_vaccinated_per_hundred float, total_boosters_per_hundred float,\
new_vaccinations_smoothed_per_million float, new_people_vaccinated_smoothed float, new_people_vaccinated_smoothed_per_hundred float,\
stringency_index float, population float, population_density float, median_age float, aged_65_older float,\
aged_70_older float, gdp_per_capita float, extreme_poverty float, cardiovasc_death_rate float, diabetes_prevalence float,\
female_smokers float, male_smokers float, handwashing_facilities float, hospital_beds_per_thousand float, life_expectancy float,\
human_development_index float, excess_mortality_cumulative_absolute float, excess_mortality_cumulative float, excess_mortality float,\
excess_mortality_cumulative_per_million float
)
'''
cursor.execute(sql_query)
ProgrammingError: (1064, "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ', continent varchar, location varchar, date varchar, total_cases float, \ \n n' at line 2")
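The 1064 here comes from the column definitions rather than from PyMySQL itself: MySQL requires an explicit length for VARCHAR, and the trailing backslashes are unnecessary inside a triple-quoted string, so they are sent to the server as literal characters (that is the `\ \n` the error message points at). A minimal sketch of a corrected query builder, with the column lists abbreviated and varchar(255) as an assumed length:

```python
# Hypothetical abbreviated column lists -- extend with the remaining columns.
text_cols = ["iso_code", "continent", "location", "date", "tests_units"]
float_cols = ["total_cases", "new_cases", "total_deaths", "new_deaths"]

# MySQL rejects a bare "varchar"; an explicit length such as varchar(255) is required.
col_defs = [f"{c} varchar(255)" for c in text_cols]
col_defs += [f"{c} float" for c in float_cols]

# No backslash continuations: the string already spans lines on its own,
# and stray backslashes would reach the server as literal characters.
sql_query = (
    "CREATE TABLE covid_19_world_cases_deaths_testing (\n    "
    + ",\n    ".join(col_defs)
    + "\n)"
)
print(sql_query)
```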

Related

What is the following error in the H2 database, and how do I resolve it?

When I create a table in the H2 database as follows:
CREATE TABLE data2 (end_lat DOUBLE, end_lng DOUBLE, member_casual CHAR(7) unique,
month INTEGER, week INTEGER, day INTEGER, Year INTEGER,
day_of_week CHAR(10), tod TIME, ride_length INTEGER);
I get the error below:
Syntax error in SQL statement "CREATE TABLE data2 (end_lat DOUBLE, end_lng DOUBLE, member_casual CHAR(7) unique, \000d\000a\0009\0009\0009\0009\0009[*]month INTEGER, week INTEGER, day INTEGER, Year INTEGER,\000d\000a\0009\0009\0009\0009\0009day_of_week CHAR(10), tod TIME, ride_length INTEGER)"; expected "identifier"; SQL statement:
CREATE TABLE data2 (end_lat DOUBLE, end_lng DOUBLE, member_casual CHAR(7) unique,
month INTEGER, week INTEGER, day INTEGER, Year INTEGER,
day_of_week CHAR(10), tod TIME, ride_length INTEGER) [42001-214] 42001/42001 (Help)
I was expecting it to create the table, since the same statement worked for me in MySQL.
I'd appreciate it if anyone can help me resolve this.
Answering my own question: after renaming the columns in the definition to the following:
CREATE TABLE data2 (end_lat DOUBLE, end_lng DOUBLE, member_casual CHAR(7) not null unique, month_is INTEGER, week_is INT, day_is INTEGER, year_is INTEGER, day_of_week CHAR(10), tod TIME, ride_length INTEGER);
the error was resolved.
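The renaming likely worked because the original names coincide with date/time keywords that recent H2 versions treat as reserved (the `[*]` marker in the error sits right before `month`). A quick way to screen column names against a keyword list; the set below is a partial, assumed subset, not H2's full list:

```python
# Partial, assumed subset of H2 2.x reserved words -- consult the H2
# documentation for the authoritative list.
H2_RESERVED = {"day", "month", "week", "year", "hour", "minute", "value"}

columns = ["end_lat", "end_lng", "member_casual", "month", "week",
           "day", "Year", "day_of_week", "tod", "ride_length"]

# Case-insensitive check, since SQL identifiers are case-insensitive by default.
clashes = [c for c in columns if c.lower() in H2_RESERVED]
print(clashes)  # exactly the names the self-answer ended up renaming
```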

Non-string values showing as NULL in Hive

I'm new to Hive and creating my first table!
For some reason all non-string values are showing as NULL (including INT, BOOLEAN, etc.).
My data looks like this sample row:
58;"management";"married";"tertiary";"no";2143;"yes";"no";"unknown";5;"may";261;1;-1;0;"unknown";"no"
I used this to create the table:
create external table bank_dataset(
age TINYINT,
job string,
education string,
default BOOLEAN,
balance INT,
housing BOOLEAN,
loan BOOLEAN,
contact STRING,
day STRING,
month STRING,
duration INT,
campaign INT,
pdays INT,
previous INT,
poutcome STRING,
y BOOLEAN)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\u003B'
STORED AS TEXTFILE
location '/user/marchenrisaad_gmail/Bank_Project'
tblproperties("skip.header.line.count"="1");
Thanks for the comments, it worked! But I have one issue: for every row I get all the data correctly, then I get extra columns of NULL values. My code is below:
create external table bank_dataset(
age TINYINT, job string, education string, default BOOLEAN, balance INT, housing BOOLEAN, loan BOOLEAN, contact STRING, day INT, month STRING, duration INT, campaign INT, pdays INT, previous INT, poutcome STRING, y BOOLEAN)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
"separatorChar" = "\u003B",
"quoteChar" = '"'
)
STORED AS TEXTFILE
location '/user/marchenrisaad_gmail/Bank_Project'
tblproperties("skip.header.line.count"="1");
Any suggestions?
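One thing worth checking (an observation from the posted sample, not a confirmed diagnosis): the sample row carries 17 semicolon-delimited fields, while the table declares only 16 columns, and the third field ("married") reads like a marital-status value, suggesting a column is missing between job and education in the DDL, which would shift every later field by one position. A quick field count:

```python
# The sample row exactly as posted in the question.
sample = ('58;"management";"married";"tertiary";"no";2143;"yes";"no";'
          '"unknown";5;"may";261;1;-1;0;"unknown";"no"')
fields = sample.split(";")

# The columns declared in the CREATE TABLE statement.
declared = ["age", "job", "education", "default", "balance", "housing", "loan",
            "contact", "day", "month", "duration", "campaign", "pdays",
            "previous", "poutcome", "y"]

print(len(fields), len(declared))  # 17 fields vs 16 declared columns
# fields[2] is "married" -- a marital-status value with no matching column.
```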

Hive insert query failing with error return code -101

I am trying to run a simple insert statement as below:
insert into table `bwc_test` partition(call_date)
select * from
`bwc_master`;
Then it fails with the below error:
INFO : Loading data to table dtc.bwc_test partition (call_date=null) from /apps/hive/warehouse/dtc.db/bwc_test/.hive-staging_hive_2018-11-13_19-10-37_084_8697431764330812894-1/-ext-10000
Error: Error while processing statement: FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.MoveTask. HIVE_LOAD_DYNAMIC_PARTITIONS_THREAD_COUNT (state=08S01,code=-101)
Table definition for bwc_master:
CREATE TABLE `bwc_master`(
unique_id bigint,
customer_id string,
direction string,
call_date_time timestamp,
duration int,
billed_duration int,
retail_rate decimal(9,7),
retail_cost decimal(19,7),
billed_tier smallint,
call_type tinyint,
record_status tinyint,
aggregate_id bigint,
originating_ipaddress string,
originating_number string,
destination_number string,
lrn string,
ocn string,
destination_rate_center string,
destination_lata int,
billed_prefix string,
rate_id string,
wholesale_rate decimal(9,7),
wholesale_cost decimal(19,7),
cnam_dipped boolean,
billed_number_type tinyint,
source_lata int,
source_ocn string,
location_id string,
sippeer_id int,
rate_attempts tinyint,
source_state string,
source_rc string,
destination_country string,
destination_state string,
destination_ip string,
carrier_id string,
rated_date_time timestamp,
partition_id smallint,
encryption_rate decimal(9,7),
encryption_cost decimal(19,7),
trans_coding_rate decimal(9,7),
trans_coding_cost decimal(19,7),
file_name string,
call_id string,
from_tag string,
to_tag string,
unique_record_id string)
PARTITIONED BY (
`call_date` date)
CLUSTERED BY (
customer_id)
INTO 10 BUCKETS
ROW FORMAT SERDE
'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION
'hdfs://*****/apps/hive/warehouse/dtc.db/bwc_master'
Can someone help me debug this? I didn't find anything in the logs.
You are missing the "table" keyword before bwc_test:
insert into table `bwc_test` partition(call_date)
select * from
`bwc_master`;

Parse Exception EOF Hive

Query:
hive> CREATE TABLE GREENTAXI(VendorID INT, pick_up_date DATE,drop_date DATE,Flag CHAR(1),rate_code INT, pick_up_long STRING,pick_up_lat STRING,drop_off_long STRING,drop_off_lat STRING,passenger_count INT,trip_distance DECIMAL,fare_amount DECIMAL,Extra DECIMAL,Tax DECIMAL,Tip DECIMAL,Tolls INT,Fee INT,Surcharge DECIMAL,total_amount DECIMAL,payment_type INT,trip_type INT)COMMENT 'Data about Green NYC Taxi for the year 2016-Jan’ ROW FORMAT DELIMITED FIELDS TERMINATED BY ','STORED AS TEXTFILE;
I get this error. Please advise.
Looks like a character-encoding problem: the COMMENT string is closed with a typographic right quote (’) instead of an ASCII apostrophe, so the parser never sees the end of the string literal. Use a plain-text editor. I tried this and it worked:
CREATE TABLE greentaxi
(
vendorid INT,
pick_up_date DATE,
drop_date DATE,
flag CHAR(1),
rate_code INT,
pick_up_long STRING,
pick_up_lat STRING,
drop_off_long STRING,
drop_off_lat STRING,
passenger_count INT,
trip_distance DECIMAL,
fare_amount DECIMAL,
extra DECIMAL,
tax DECIMAL,
tip DECIMAL,
tolls INT,
fee INT,
surcharge DECIMAL,
total_amount DECIMAL,
payment_type INT,
trip_type INT
)
comment 'Data about Green NYC Taxi for the year 2016-Jan'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;
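The offending character is visible in the original statement: the COMMENT literal opens with an ASCII apostrophe but closes with a right single quotation mark (’, U+2019), typically pasted from a word processor or a web page. A small check like this can flag such characters before a statement is submitted (the snippet uses a shortened stand-in for the full query):

```python
# Shortened stand-in for the original CREATE TABLE statement.
stmt = "COMMENT 'Data about Green NYC Taxi for the year 2016-Jan’ ROW FORMAT DELIMITED"

# Flag anything outside printable ASCII -- smart quotes are the usual culprit.
suspicious = [(i, ch, hex(ord(ch))) for i, ch in enumerate(stmt) if ord(ch) > 126]
print(suspicious)
```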

How do I INSERT OVERWRITE with a struct in HIVE?

I have a Hive table tweets stored as text that I am trying to write to another table tweetsORC that is ORC. Both have the same structure:
col_name data_type comment
racist boolean from deserializer
contributors string from deserializer
coordinates string from deserializer
created_at string from deserializer
entities struct<hashtags:array<string>,symbols:array<string>,urls:array<struct<display_url:string,expanded_url:string,indices:array<tinyint>,url:string>>,user_mentions:array<string>> from deserializer
favorite_count tinyint from deserializer
favorited boolean from deserializer
filter_level string from deserializer
geo string from deserializer
id bigint from deserializer
id_str string from deserializer
in_reply_to_screen_name string from deserializer
in_reply_to_status_id string from deserializer
in_reply_to_status_id_str string from deserializer
in_reply_to_user_id string from deserializer
in_reply_to_user_id_str string from deserializer
is_quote_status boolean from deserializer
lang string from deserializer
place string from deserializer
possibly_sensitive boolean from deserializer
retweet_count tinyint from deserializer
retweeted boolean from deserializer
source string from deserializer
text string from deserializer
timestamp_ms string from deserializer
truncated boolean from deserializer
user struct<contributors_enabled:boolean,created_at:string,default_profile:boolean,default_profile_image:boolean,description:string,favourites_count:tinyint,follow_request_sent:string,followers_count:tinyint,following:string,friends_count:tinyint,geo_enabled:boolean,id:bigint,id_str:string,is_translator:boolean,lang:string,listed_count:tinyint,location:string,name:string,notifications:string,profile_background_color:string,profile_background_image_url:string,profile_background_image_url_https:string,profile_background_tile:boolean,profile_image_url:string,profile_image_url_https:string,profile_link_color:string,profile_sidebar_border_color:string,profile_sidebar_fill_color:string,profile_text_color:string,profile_use_background_image:boolean,protected:boolean,screen_name:string,statuses_count:smallint,time_zone:string,url:string,utc_offset:string,verified:boolean> from deserializer
When I try to insert from tweets to tweetsORC I get:
INSERT OVERWRITE TABLE tweetsORC SELECT * FROM tweets;
FAILED: NoMatchingMethodException No matching method for class org.apache.hadoop.hive.ql.udf.UDFToString with (struct<hashtags:array<string>,symbols:array<string>,urls:array<struct<display_url:string,expanded_url:string,indices:array<tinyint>,url:string>>,user_mentions:array<string>>). Possible choices: _FUNC_(bigint) _FUNC_(binary) _FUNC_(boolean) _FUNC_(date) _FUNC_(decimal(38,18)) _FUNC_(double) _FUNC_(float) _FUNC_(int) _FUNC_(smallint) _FUNC_(string) _FUNC_(timestamp) _FUNC_(tinyint) _FUNC_(void)
The only help I have found on this kind of problem says to make a UDF use primitive types, but I am not using a UDF! Any help is much appreciated!
FYI: Hive version:
Hive 1.2.1000.2.4.2.0-258
Subversion git://u12-slave-5708dfcd-10/grid/0/jenkins/workspace/HDP-build-ubuntu12/bigtop/output/hive/hive-1.2.1000.2.4.2.0 -r 240760457150036e13035cbb82bcda0c65362f3a
EDIT: Create tables and sample data:
create table tweets (
contributors string,
coordinates string,
created_at string,
entities struct <
hashtags: array <string>,
symbols: array <string>,
urls: array <struct <
display_url: string,
expanded_url: string,
indices: array <tinyint>,
url: string>>,
user_mentions: array <string>>,
favorite_count tinyint,
favorited boolean,
filter_level string,
geo string,
id bigint,
id_str string,
in_reply_to_screen_name string,
in_reply_to_status_id string,
in_reply_to_status_id_str string,
in_reply_to_user_id string,
in_reply_to_user_id_str string,
is_quote_status boolean,
lang string,
place string,
possibly_sensitive boolean,
retweet_count tinyint,
retweeted boolean,
source string,
text string,
timestamp_ms string,
truncated boolean,
`user` struct <
contributors_enabled: boolean,
created_at: string,
default_profile: boolean,
default_profile_image: boolean,
description: string,
favourites_count: tinyint,
follow_request_sent: string,
followers_count: tinyint,
`following`: string,
friends_count: tinyint,
geo_enabled: boolean,
id: bigint,
id_str: string,
is_translator: boolean,
lang: string,
listed_count: tinyint,
location: string,
name: string,
notifications: string,
profile_background_color: string,
profile_background_image_url: string,
profile_background_image_url_https: string,
profile_background_tile: boolean,
profile_image_url: string,
profile_image_url_https: string,
profile_link_color: string,
profile_sidebar_border_color: string,
profile_sidebar_fill_color: string,
profile_text_color: string,
profile_use_background_image: boolean,
protected: boolean,
screen_name: string,
statuses_count: smallint,
time_zone: string,
url: string,
utc_offset: string,
verified: boolean>
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
STORED AS TEXTFILE;
LOAD DATA LOCAL INPATH '/home/ed/Downloads/hive-json-master/1abbo.txt' OVERWRITE INTO TABLE tweets;
create table tweetsORC (
racist boolean,
contributors string,
coordinates string,
created_at string,
entities struct <
hashtags: array <string>,
symbols: array <string>,
urls: array <struct <
display_url: string,
expanded_url: string,
indices: array <tinyint>,
url: string>>,
user_mentions: array <string>>,
favorite_count tinyint,
favorited boolean,
filter_level string,
geo string,
id bigint,
id_str string,
in_reply_to_screen_name string,
in_reply_to_status_id string,
in_reply_to_status_id_str string,
in_reply_to_user_id string,
in_reply_to_user_id_str string,
is_quote_status boolean,
lang string,
place string,
possibly_sensitive boolean,
retweet_count tinyint,
retweeted boolean,
source string,
text string,
timestamp_ms string,
truncated boolean,
`user` struct <
contributors_enabled: boolean,
created_at: string,
default_profile: boolean,
default_profile_image: boolean,
description: string,
favourites_count: tinyint,
follow_request_sent: string,
followers_count: tinyint,
`following`: string,
friends_count: tinyint,
geo_enabled: boolean,
id: bigint,
id_str: string,
is_translator: boolean,
lang: string,
listed_count: tinyint,
location: string,
name: string,
notifications: string,
profile_background_color: string,
profile_background_image_url: string,
profile_background_image_url_https: string,
profile_background_tile: boolean,
profile_image_url: string,
profile_image_url_https: string,
profile_link_color: string,
profile_sidebar_border_color: string,
profile_sidebar_fill_color: string,
profile_text_color: string,
profile_use_background_image: boolean,
protected: boolean,
screen_name: string,
statuses_count: smallint,
time_zone: string,
url: string,
utc_offset: string,
verified: boolean>
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
STORED AS ORC tblproperties ("orc.compress"="ZLIB");
Instead of using SELECT *, I list the fields by name and the error goes away.
Data type mismatch: the data type you are trying to insert is inconsistent with the column type in the target table. For example, if the column was declared as string when the table was created but the value you insert is a list type, this error will be thrown.
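The by-name fix matches what the two schemas show: tweetsORC declares an extra leading column (racist boolean) that tweets lacks, and INSERT ... SELECT * pairs columns by position, not by name, so every source column shifts by one in the target. A sketch of the shift, with the column lists abbreviated to the leading columns from the posted DDL:

```python
# Abbreviated leading columns of each table (taken from the posted DDL).
source_cols = ["contributors", "coordinates", "created_at", "entities"]
target_cols = ["racist", "contributors", "coordinates", "created_at", "entities"]

# SELECT * maps positionally: target column i receives source column i.
mapping = list(zip(target_cols, source_cols))
print(mapping)
# The struct-typed "entities" lands in the string-typed "created_at",
# which is exactly the struct-to-string cast the error complains about.
```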