How can I create an external dictionary from a URL in ClickHouse?

I'm trying to create a dictionary from http://standards-oui.ieee.org/oui/oui.csv
With this code:
CREATE DICTIONARY TestDict (
    registry String DEFAULT '',
    assignment String DEFAULT '',
    name String DEFAULT '',
    address String DEFAULT ''
)
PRIMARY KEY assignment
SOURCE(HTTP(
    url 'http://standards-oui.ieee.org/oui/oui.csv'
    format 'CSVWithNames'
))
LAYOUT(FLAT())
LIFETIME(300)
But when I try SELECT * FROM default.TestDict it returns the error "Table default.testDict doesn't exist", and then the dictionary status turns to "FAILED".
What am I doing wrong?

First, let's look at the dictionary's last_exception field:
SELECT *
FROM system.dictionaries
FORMAT Vertical
/*
..
last_exception: Code: 27, e.displayText() = DB::Exception: Cannot parse input: expected , before: MA-L,002272,American Micro-Fuel Device Corp.,2181 Buchanan Loop Ferndale WA US 98248 \r\nMA-L,00D0EF,IGT,9295 PROTOTYPE DRIVE RENO NV US 89511 \r\nMA-L,086195,Rockw: (at row 1)
Row 1:
Column 0, name: assignment, type: UInt64, ERROR: text "MA-L,00227" is not like UInt64
..
*/
The reason for the problem is the wrong choice of layout: the FLAT layout expects a numeric primary key of type UInt64, not a String.
You need to use a composite key, which supports String columns, together with the corresponding COMPLEX_KEY_HASHED layout.
CREATE DICTIONARY TestDict
(
    registry String DEFAULT '',
    assignment String DEFAULT '',
    name String DEFAULT '',
    address String DEFAULT ''
)
PRIMARY KEY registry, assignment, name
SOURCE(HTTP(URL 'http://standards-oui.ieee.org/oui/oui.csv' FORMAT CSVWithNames))
LIFETIME(MIN 0 MAX 300)
LAYOUT(COMPLEX_KEY_HASHED())
Note that the primary key consists of three columns (registry, assignment, name) to uniquely identify each row.
SELECT count()
FROM default.TestDict
/* result
┌─count()─┐
│   27742 │
└─────────┘
*/
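The layout mismatch is easy to reproduce outside ClickHouse. The sketch below parses a small sample in the same CSVWithNames shape (the two data rows are taken from the error message above; the header names are an assumption about the published oui.csv): some Assignment values happen to look numeric, but others are hex strings, so the key can only be a String.

```python
import csv
import io

# A small sample in the same CSVWithNames layout as oui.csv
# (data rows copied from the last_exception message above).
sample = """\
Registry,Assignment,Organization Name,Organization Address
MA-L,002272,American Micro-Fuel Device Corp.,2181 Buchanan Loop Ferndale WA US 98248
MA-L,00D0EF,IGT,9295 PROTOTYPE DRIVE RENO NV US 89511
"""

rows = list(csv.DictReader(io.StringIO(sample)))

# The Assignment column is a hex OUI prefix: '002272' happens to parse as
# an integer, but '00D0EF' cannot, so a UInt64 key (LAYOUT(FLAT())) fails.
for row in rows:
    assignment = row["Assignment"]
    try:
        int(assignment)
        numeric = True
    except ValueError:
        numeric = False
    print(assignment, "numeric:", numeric)
```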

Related

operator.itemgetter(*item) giving error TypeError: '<' not supported between instances of 'NoneType' and 'int'

I am using a recursive function for grouping a data list that I get by executing a database query, and the function below worked fine. But since yesterday it started throwing the error below:
in _group_data data_list.sort(key=itemgetter(*filter_key))
TypeError: '<' not supported between instances of 'NoneType' and 'int'
from itertools import groupby
from operator import itemgetter

def _group_data(data_list, key_list, previous_key=[]):
    """
    :param data_list: A list of dictionaries obtained by executing the SQL query.
    :param key_list: Group-by list of keys.
    :param previous_key: The earlier key used for grouping at the root level. Default is an empty list.
    :return: Grouped data list.
    """
    # Base case
    if len(key_list) == 1:
        filter_key = key_list[0]
        data_list.sort(key=itemgetter(*filter_key))
        dl = list()
        for _, group in groupby(data_list, key=itemgetter(*filter_key)):
            d_dict = {'details': []}
            for g in group:
                if previous_key is not None:
                    d_dict['details'].append(
                        {key: value for key, value in g.items() if key not in (filter_key + previous_key)}
                    )
                    d_dict.update({key: value for key, value in g.items() if key in filter_key})
                else:
                    d_dict['details'].append({key: value for key, value in g.items() if key not in filter_key})
                    d_dict.update({key: value for key, value in g.items() if key in filter_key})
            dl.append(d_dict)
        return dl
    # Recursive block
    else:
        filter_key = key_list[0]
        dl = list()
        p_key = previous_key + filter_key
        data_list.sort(key=itemgetter(*filter_key))  # getting error here
        for _, group in groupby(data_list, key=itemgetter(*filter_key)):
            group_list = list(group)
            print('key_list[1:] ', key_list[1:])
            print('p_key ', p_key)
            d_list = _group_data(group_list, key_list[1:], p_key)
            p_dict = {key: value for key, value in group_list[0].items() if key in filter_key}
            p_dict['details'] = d_list
            dl.append(p_dict)
        return dl
I am trying to identify the problem but am not getting any clue. I also tried a few things from Google searches, for example sorted(), but got the same error. Thanks in advance for the help.
The issue was resolved. I was getting null for a key that was supposed to be an integer, which is why the error was thrown. So, on the DB side, the value of that key is now set to 0 as a default.
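The accepted fix was applied on the database side. If the data cannot be changed, an alternative is a None-tolerant sort key; a minimal sketch with made-up data, using a single string key rather than the question's key lists:

```python
from operator import itemgetter

rows = [{"grp": 2}, {"grp": None}, {"grp": 1}]

# Plain itemgetter fails as soon as a None is compared with an int:
try:
    rows.sort(key=itemgetter("grp"))
except TypeError as e:
    print("sort failed:", e)

# A None-tolerant key: non-None values sort by value, Nones go last.
rows.sort(key=lambda r: (r["grp"] is None, r["grp"]))
print(rows)
```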

Setting Linq2Db to read and update with CHAR type and NOT NULL constraint in Oracle

Reading value of fixed CHAR string from table, using Linq2db Oracle provider:
CREATE TABLE mytable
(pk NUMBER(15,0) NOT NULL,
fixed_data CHAR(20) DEFAULT ' ' NOT NULL)
Although in the database the length of the FIXED_DATA field is 20,
SELECT LENGTH(fixed_data) FROM mytable WHERE pk = 1
-- result is 20
When the same field is read using Linq2Db, the value gets truncated to an empty string:
var row = (from r in database.mytable where r.pk == 1 select r).ToList()[0];
Console.WriteLine(row.fixed_data.Length);
// result is zero
This causes a problem when the record is updated using Linq2Db: Oracle converts the empty string to NULL, and the UPDATE fails:
database.Update(row);
// Oracle.ManagedDataAccess.Client.OracleException: 'ORA-01407: cannot update ("MYSCHEMA"."MYTABLE"."FIXED_DATA") to NULL
Is there any setting in Linq2Db for read->update cycle to work with CHAR type and NOT NULL constraint?
Found a solution, thanks to the openly available source code. By default, Linq2Db applies the expression IDataReader.GetString(int).TrimEnd(' ') to every CHAR and NCHAR column. However, this can easily be customized by implementing a custom provider that overrides the field value retrieval expression with one that does not trim:
class MyOracleProvider : OracleDataProvider
{
    public MyOracleProvider(string name)
        : base(name)
    {
        // original is SetCharField("Char", (r, i) => r.GetString(i).TrimEnd(' '));
        SetCharField("Char", (r, i) => r.GetString(i));
        // original is SetCharField("NChar", (r, i) => r.GetString(i).TrimEnd(' '));
        SetCharField("NChar", (r, i) => r.GetString(i));
    }
}
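The failure mode itself is independent of Linq2Db. A small Python sketch of the same logic (the helper names are made up for illustration): trimming an all-blank CHAR(20) yields an empty string, which Oracle treats as NULL on update.

```python
# A CHAR(20) column pads stored values with spaces to the full width.
def char_fetch_with_trim(raw):
    """Simulates the default Linq2Db behavior: TrimEnd(' ')."""
    return raw.rstrip(" ")

def char_fetch_no_trim(raw):
    """Simulates the customized provider: return the value as stored."""
    return raw

stored = " " * 20  # the DEFAULT ' ' value, padded to 20 characters

trimmed = char_fetch_with_trim(stored)
print(len(trimmed))    # 0 -> empty string, which Oracle maps to NULL on UPDATE

untrimmed = char_fetch_no_trim(stored)
print(len(untrimmed))  # 20 -> round-trips without violating NOT NULL
```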

Using FOR with a dynamic internal table?

I would like to convert the method below to a nested FOR instead of a nested LOOP, but I don't know how to do it since the inner table is dynamic (it can be one of 5 different types).
TYPES: BEGIN OF ty_result,
         lgart TYPE string,
         betrg TYPE string,
         betpe TYPE string,
       END OF ty_result,
       ty_results TYPE STANDARD TABLE OF ty_result WITH EMPTY KEY.

DATA: known_table       TYPE ty_results,
      also_known_table  TYPE ty_results,
      mt_period_results TYPE ty_results.

FIELD-SYMBOLS: <dynamic_table> TYPE STANDARD TABLE,
               <betrg>, <betpe>, <lgart>.

LOOP AT known_table REFERENCE INTO DATA(known_line).
  READ TABLE <dynamic_table> TRANSPORTING NO FIELDS WITH KEY ('LGART') = known_line->*-lgart.
  IF sy-subrc <> 0. CONTINUE. ENDIF.
  DATA(lv_tabix) = sy-tabix.
  LOOP AT <dynamic_table> ASSIGNING FIELD-SYMBOL(<dynamic_line>) FROM lv_tabix.
    UNASSIGN: <betrg>, <betpe>, <lgart>.
    ASSIGN COMPONENT: 'BETPE' OF STRUCTURE <dynamic_line> TO <betpe>,
                      'BETRG' OF STRUCTURE <dynamic_line> TO <betrg>,
                      'LGART' OF STRUCTURE <dynamic_line> TO <lgart>.
    IF <lgart> <> known_line->*-lgart.
      EXIT.
    ENDIF.
    APPEND VALUE ty_result( lgart = <lgart>
                            betrg = <betrg>
                            betpe = <betpe> ) TO mt_period_results.
  ENDLOOP.
ENDLOOP.
When the inner table is not dynamic, I can do it like this:
append lines of value zwta_t_results(
    for known_line in known_table
    for also_known_line in also_known_table
    where ( lgart = known_line-lgart )
    ( lgart = known_line-lgart
      betrg = also_known_line-betrg
      betpe = also_known_line-betpe ) ) to mt_period_results.
So the question is: is it possible to use FOR iterator (as the second method) with a dynamic table?
My answer was checked against ABAP 7.52. Unfortunately, it's currently only possible to use a subset of the static variant of ASSIGN, via LET <fs> = writable_expression IN inside a constructor expression (including FOR table iterations), where the "writable expression" is limited to a table expression, NEW, or CAST. So it's rather limited; there are no equivalents for the dynamic variants of ASSIGN, and you can only use workarounds.
The syntax after WHERE allows a dynamic expression, so it is possible to write WHERE ('LGART = KNOWN_LINE-LGART'). However, this can perform very badly if the loop is nested inside another loop (as in your case), so a key should be defined to speed up the iteration. If a secondary key is to be used, the condition becomes USING KEY ('KEYNAME') WHERE ('LGART = KNOWN_LINE-LGART').
Now, here is a workaround for your particular case: since you define the component names statically, one possibility is to define a static structure with those component names and use the CORRESPONDING constructor operator. Note that I didn't test it, but for several reasons I think CORRESPONDING performs better in your case than ASSIGN.
The following code should work. I assume that the internal table behind <dynamic_table> has a primary key sorted by LGART (TYPE SORTED TABLE OF ... WITH NON-UNIQUE KEY lgart) so that performance is good:
TYPES: BEGIN OF ty_struc,
         lgart TYPE string,
         betrg TYPE string,
         betpe TYPE string,
       END OF ty_struc.

known_table = VALUE #( ( lgart = 'A' ) ( lgart = 'B' ) ).
also_known_table = VALUE #( ( lgart = 'A' ) ( lgart = 'C' ) ( lgart = 'A' ) ).
ASSIGN also_known_table TO <dynamic_table>.

APPEND LINES OF
    VALUE ty_results(
        FOR known_line IN known_table
        FOR <inner_line> IN <dynamic_table>
        WHERE ('LGART = KNOWN_LINE-LGART')
        LET struc = CORRESPONDING ty_struc( <inner_line> ) IN
        ( lgart = known_line-lgart
          betrg = struc-betrg
          betpe = struc-betpe ) )
    TO mt_period_results.
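The shape of this workaround (a nested FOR with a WHERE join, projecting only the statically known components out of a "dynamic" row) can be sketched in Python with dicts standing in for the dynamic structures; the data values mirror the ABAP example above, and the dict projection plays the role of CORRESPONDING ty_struc( <inner_line> ):

```python
known_table = [{"lgart": "A"}, {"lgart": "B"}]

# The "dynamic" inner table: rows may carry extra components, but
# LGART/BETRG/BETPE are known statically (hypothetical values).
dynamic_table = [
    {"lgart": "A", "betrg": "10", "betpe": "1", "extra": "x"},
    {"lgart": "C", "betrg": "20", "betpe": "2"},
    {"lgart": "A", "betrg": "30", "betpe": "3"},
]

# Nested FOR ... WHERE as a comprehension; the projection keeps only
# the three known components, like CORRESPONDING does.
mt_period_results = [
    {"lgart": k["lgart"], "betrg": d["betrg"], "betpe": d["betpe"]}
    for k in known_table
    for d in dynamic_table
    if d["lgart"] == k["lgart"]
]
print(mt_period_results)
```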

Ruby - sqlite3, inserting a non ascii string into database

When trying to insert into the database a string such as:
"セブンゴースト; 神幻拍挡07;"
there is no error, but it is read back as nil after a SELECT.
Example:
string = "セブンゴースト; 神幻拍挡07;"
db = SQLite3::Database.new "randomfilename"
prep = db.prepare "INSERT INTO test_db VALUES (NULL, ?)"
prep.bind_param 1, string
prep.execute
prep2 = db.prepare "SELECT * FROM test_db"
ret = prep2.execute
p ret
This will display something like: [[0, nil]]
(assuming that the table has an integer primary key as the first column and a TEXT as the second).
It is possible that my database does not have the right encoding, but if that is the case, how do I change it without losing everything?
Found the answer: I needed an NTEXT field, not a TEXT one.
NTEXT supports UTF-8 encoding:
http://www.w3schools.com/sql/sql_datatypes.asp
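The round-trip can be cross-checked with Python's built-in sqlite3 module. The column is declared NTEXT as in the answer (note that SQLite maps declared type names by affinity, and any name containing "TEXT", including NTEXT, gets TEXT affinity), and the string from the question comes back intact:

```python
import sqlite3

s = "セブンゴースト; 神幻拍挡07;"

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE test_db (id INTEGER PRIMARY KEY, name NTEXT)")
db.execute("INSERT INTO test_db VALUES (NULL, ?)", (s,))

row = db.execute("SELECT * FROM test_db").fetchone()
print(row)  # (1, 'セブンゴースト; 神幻拍挡07;')
db.close()
```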

Dashes Causing SQL Trouble in DBI

I have a SQL query with a WHERE clause whose values typically include a dash, stored in the database as CHAR(10). When I call it explicitly, like the following:
$sth = $dbh->prepare("SELECT STATUS_CODE FROM MyTable WHERE ACC_TYPE = 'A-50C'");
It works and properly returns my 1 row; however if I do the following:
my $code = 'A-50C';
$sth = $dbh->prepare("SELECT STATUS_CODE FROM MyTable WHERE ACC_TYPE = ?");
$sth->execute($code);
or I do:
my $code = 'A-50C';
$sth = $dbh->prepare("SELECT STATUS_CODE FROM MyTable WHERE ACC_TYPE = ?");
$sth->bind_param(1, $code);
$sth->execute();
The query completes, but I get no results. I suspect it has to do with the dash being interpreted incorrectly, but I can't tie it to a Perl issue: I have printed my $code variable using print "My Content: $code\n"; so I can confirm it's not being strangely converted. I also tried including a third argument to bind_param; if I specify something like ORA_VARCHAR2 or SQL_VARCHAR (I tried all possibilities) I still get no results, and if I change it to the long form, i.e. { TYPE => SQL_VARCHAR }, it gives me the error
DBI::st=HASH(0x232a210)->bind_param(...): attribute parameter
'SQL_VARCHAR' is not a hash ref
Lastly, I tried single and double quotes in different ways, as well as backticks, to escape the values, but nothing got me the 1 row, only 0. Any ideas? I haven't found anything in the documentation or by searching. This is Oracle, for reference.
Code with error checking:
my $dbh = DBI->connect($dsn, $user, $pw, {PrintError => 0, RaiseError => 0})
or die "$DBI::errstr\n";
# my $dbh = DBI->connect(); # connect
my $code = 'A-50C';
print "My Content: $code\n";
$sth = $dbh->prepare( "SELECT COUNT(*) FROM MyTable WHERE CODE = ?" )
or die "Can't prepare SQL statement: $DBI::errstr\n";
$sth->bind_param(1, $code);
$sth->execute() or die "Can't execute SQL statement: $DBI::errstr\n";
my $outfile = 'output.txt';
open OUTFILE, '>', $outfile or die "Unable to open $outfile: $!";
while (my @re = $sth->fetchrow_array) {
    print OUTFILE @re, "\n";
}
warn "Data fetching terminated early by error: $DBI::errstr\n"
    if $DBI::err;
close OUTFILE;
$sth->finish();
$dbh->disconnect();
I ran a trace and got back:
-> bind_param for DBD::Oracle::st (DBI::st=HASH(0x22fbcc0)~0x3bcf48 2 'A-50C' HASH(0x22fbac8)) thr#3b66c8
dbd_bind_ph(1): bind :p2 <== 'A-50C' (type 0 (DEFAULT (varchar)), attribs: HASH(0x22fbac8))
dbd_rebind_ph_char() (1): bind :p2 <== 'A-50C' (size 5/16/0, ptype 4(VARCHAR), otype 1 )
dbd_rebind_ph_char() (2): bind :p2 <== ''A-50' (size 5/16, otype 1(VARCHAR), indp 0, at_exec 1)
bind :p2 as ftype 1 (VARCHAR)
dbd_rebind_ph(): bind :p2 <== 'A-50C' (in, not-utf8, csid 178->0->178, ftype 1 (VARCHAR), csform 0(0)->0(0), maxlen 16, maxdata_size 0)
Your problem is likely a result of comparing CHAR and VARCHAR data together.
The CHAR data type is notorious (and should be avoided) because it stores data in fixed-length format; it should never be used to hold varying-length data. In your case, data stored in the ACC_TYPE column will always take up 10 characters of storage. When you store a value whose length is less than the size of the column, like A-50C, the database implicitly pads the string up to 10 characters, so the actual value stored becomes A-50C_____ (where _ represents a space).
Your first query works because when you use a hard-coded literal, Oracle automatically right-pads the value for you (A-50C -> A-50C_____). However, in your second query, where you use bind variables, you're comparing a VARCHAR against a CHAR, and no auto-padding happens.
As a quick fix to the problem, you could add right-padding to the query:
SELECT STATUS_CODE FROM MyTable WHERE ACC_TYPE = rpad(?, 10)
A long-term solution would be to avoid using the CHAR data type in your table definitions and switch to VARCHAR2 instead.
As your DBI_TRACE revealed, the ACC_TYPE column is CHAR(10) but the bound parameter is understood as VARCHAR.
When comparing CHAR, NCHAR, or literal strings to one another, trailing blanks are effectively ignored. (Remember, CHAR(10) means that ACC_TYPE values are padded to 10 characters long.) Thus, 'A ' and 'A', as CHARs, compare equal. When comparing with the VARCHAR family, however, trailing blanks become significant, and 'A ' no longer equals 'A' if one is a VARCHAR variant.
You can confirm this in sqlplus or via a quick DBI query:
SELECT COUNT(1) FROM DUAL WHERE CAST('A' AS CHAR(2)) = 'A'; -- or CAST AS CHAR(whatever)
SELECT COUNT(1) FROM DUAL WHERE CAST('A' AS CHAR(2)) = CAST('A' AS VARCHAR(1));
(Oracle terms these blank-padded and nonpadded comparison semantics, since the actual behavior per the ANSI-92 spec is to pad the shorter CHAR or literal to the length of the longer and then compare. The effective behavior, whatever the name, is that one ignores trailing blanks and the other does not.)
As @MickMnemonic suggested, RPAD()ing the bound value will work, or, better, altering the column type to VARCHAR. You could also CAST(? AS CHAR(10)). TRIM(TRAILING FROM ACC_TYPE) would work too, at the cost of ignoring any index on that column.
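The blank-padded vs nonpadded semantics can be sketched in a few lines of Python, with ljust standing in for CHAR storage (this models the comparison rules only, not Oracle itself):

```python
def char_store(value, width):
    """Simulate what a CHAR(width) column stores: blank-padded to width."""
    return value.ljust(width)

stored = char_store("A-50C", 10)  # what the ACC_TYPE column actually holds

# Nonpadded (VARCHAR-style) comparison, as used for bind variables:
print(stored == "A-50C")          # False: trailing blanks are significant

# Blank-padded comparison: pad the shorter operand first, then compare.
# This is exactly what the RPAD(?, 10) fix reproduces on the bind side.
print(stored == "A-50C".ljust(10))  # True
```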
