Cassandra querying a text value with space in shell - utf-8

I'm having an issue trying to query my database. My script with cassandra-driver was this:
const query = 'CREATE TABLE IF NOT EXISTS test.RestaurantMenuItems ' +
'(id UUID, restaurantId varchar, menuName text, menuCategoryNames text, menuItemName text, menuItemDescription text, menuItemPrice decimal, PRIMARY KEY (id))';
return client.execute(query);
I have no idea how I could query with the spaces involved.
https://i.stack.imgur.com/0HU9b.png

With that schema you can only do selects on id, not on restaurantId. To satisfy that query, C* would have to read the entire dataset from every node. If that is a query you will want to make, your table would likely look something like:
CREATE TABLE IF NOT EXISTS test.items_by_restaurant (
restaurant_id varchar,
menu_name text,
menu_category_name text,
menu_item_name text,
menu_item_description text,
menu_item_price decimal,
PRIMARY KEY ((restaurant_id), menu_name, menu_category_name, menu_item_name)
);
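With that partition key, the lookup by restaurant becomes a single-partition read. A sketch of the query (the id value is a placeholder):

```sql
-- Single-partition read: all menu items for one restaurant, ordered by
-- the clustering columns (menu_name, menu_category_name, menu_item_name).
SELECT menu_name, menu_category_name, menu_item_name, menu_item_price
FROM test.items_by_restaurant
WHERE restaurant_id = 'some-restaurant-id';
```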

As per your screenshot, this is not an issue of text with a space. A space in a text value is counted as a character like any other. As @Chris mentioned earlier, you are querying on a column that is neither the partition key nor indexed. You would need to use ALLOW FILTERING in your query to get the data, which is not recommended. Try creating an index on the column you want to query with:
create index on restaurantmenuitems (restaurantid) ;
I am attaching a screen shot.
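Once the index is in place, quoting the literal keeps the spaces intact. A sketch (the value is a placeholder):

```sql
-- The single quotes preserve the embedded spaces in the text value.
SELECT * FROM restaurantmenuitems
WHERE restaurantid = 'some name with spaces';
```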


Find the best way to traverse an Oracle table

I have an Oracle table. The table's DDL is (it does not have a primary key):
create table CLIENT_ACCOUNT
(
CLIENT_ID VARCHAR2(18) default ' ' not null,
ACCOUNT_ID VARCHAR2(18) default ' ' not null,
......
)
create unique index UK_ACCOUNT
on CLIENT_ACCOUNT (CLIENT_ID, ACCOUNT_ID)
The table is huge, maybe 100M records, and I want to traverse the whole table in batches.
Right now I use the table's index to batch through it, but I have some Oracle syntax problems.
-- I want to use this SQL, but it raises a syntax error.
-- Trying to use the B-tree index to locate the start position, but it does not work.
select * from CLIENT_ACCOUNT
WHERE (CLIENT_ID, ACCOUNT_ID) > (1,2)
AND ROWNUM < 1000
ORDER BY CLIENT_ID, ACCOUNT_ID
Is there a faster way to batch through the table's data?
Wild guess:
select * from CLIENT_ACCOUNT
WHERE CLIENT_ID > '1'
and ACCOUNT_ID > '2'
AND ROWNUM < 1000;
It would at least compile, although whether it correctly implements your business logic is a different matter. Note that I have cast your filter criteria to strings. This is because your columns have a string datatype and you are defaulting them to spaces, so there's a high probability those columns contain non-numeric values.
If this doesn't solve your problem, please edit your question with more details; sample input data and expected output is always helpful in these situations.
Your data model seems odd.
Your columns are defined as varchar2, so why are your criteria numeric?
Also, why do you default the key columns to space? It would be better to leave unpopulated values as null. (To be clear, NULL is not a good thing in an indexed column, it's just better than a space.)
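For the batching itself, a common alternative is keyset ("seek") pagination, which expresses the tuple comparison from the original attempt in a form Oracle accepts. A sketch, assuming Oracle 12c+ for FETCH FIRST, and that :last_client / :last_account hold the last key of the previous batch:

```sql
-- Resume after the last (client_id, account_id) pair seen in the previous batch.
-- This predicate lets the UK_ACCOUNT index drive the scan.
SELECT *
FROM client_account
WHERE client_id > :last_client
   OR (client_id = :last_client
       AND account_id > :last_account)
ORDER BY client_id, account_id
FETCH FIRST 1000 ROWS ONLY;
```

Note that in the original attempt, ROWNUM is applied before the ORDER BY, so the "first" 1000 rows would not be the first 1000 in sorted order.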

How to do the following query in Oracle NoSQL

I am planning to use NoSQL Cloud Service as our datastore. I have a question about the MAP data type. Say I have a column "labels" (labels MAP(RECORD(value STRING, contentType STRING))) in table "myTable", where the "labels" column is the MAP data type and the value is the RECORD data type.
I want to query the table to return all rows where a key of "labels" equals a particular value. What would the SQL statement look like? I tried:
select * from myTable where labels.keys($key='xxxx')
which doesn't work.
Do we need to add an index for the label field in the MAP? Is there any performance improvement? If yes, how do we add this index?
Thanks
Please try the following syntax:
select * from myTable t
where t.labels.keys() =any "xxx"
Your syntax is good if you add exists:
select * from myTable t
where exists t.labels.keys($key = "xxx")
Concerning your question about performance: there will be a significant performance improvement.
If you want to index only the field names (keys) of the map,
you create the index like this:
create index idx_keys on myTable(labels.keys())
If you want to index both the keys and the associated values:
create index idx_keys_values
on myTable(labels.keys(), labels.values())

How to store weather station Details along with monitoring data efficiently?

I was following the TimeSeries data modelling in PlanetCassandra by Patrick McFadin. Regarding that, I had one query:
If I need to store the weather station name also, should it be in the same table, say:
create table test (wea_id int, wea_name text, wea_add text, eventday timeuuid, eventtime timeuuid, temp int, PRIMARY KEY ((wea_id, eventday), eventtime) );
This forces me to enter wea_name and wea_add for each new row. How can I identify that a new row has been created? Or is there a better mechanism for modelling the above data?
Regards,
Seenu.
I'm assuming you're referring to the article on getting started with time series data modeling at http://planetcassandra.org/getting-started-with-time-series-data-modeling/
The original CQL listed is:
CREATE TABLE temperature_by_day (
weatherstation_id text,
date text,
event_time timestamp,
temperature text,
PRIMARY KEY ((weatherstation_id,date),event_time)
);
If you need to add an attribute that's associated with the partition key, in this case (weatherstation_id,date), Cassandra has a feature that does just that: static columns
http://www.datastax.com/documentation/cql/3.1/cql/cql_reference/refStaticCol.html
So you could write the statement like this:
CREATE TABLE temperature_by_day (
weatherstation_id text,
weatherstation_name text STATIC,
date text,
event_time timestamp,
temperature text,
PRIMARY KEY ((weatherstation_id,date),event_time)
);
You would be storing the name once per (weatherstation_id,date) combination rather than for every observation.
Ideally you'd like to store a name once per weather station. Using this choice of partition key you can't do this; you could model with one device per partition as per Patrick's first example if you like:
CREATE TABLE temperature (
weatherstation_id text,
weatherstation_name text STATIC,
event_time timestamp,
temperature text,
PRIMARY KEY ((weatherstation_id),event_time)
);
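A minimal sketch of how the static column behaves (the values are made up): the name is stored once per partition, so inserting it with one row makes it visible on every row of that partition.

```sql
-- First insert sets the per-partition static value.
INSERT INTO temperature (weatherstation_id, weatherstation_name, event_time, temperature)
VALUES ('1234ABCD', 'Station A', '2020-01-01 07:01:00', '72F');

-- Later inserts for the same station can omit the name entirely.
INSERT INTO temperature (weatherstation_id, event_time, temperature)
VALUES ('1234ABCD', '2020-01-01 07:02:00', '73F');

-- Both rows come back with weatherstation_name = 'Station A'.
SELECT weatherstation_id, weatherstation_name, event_time, temperature
FROM temperature
WHERE weatherstation_id = '1234ABCD';
```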

HIVE order by messes up data

In Hive 0.8 with Hadoop 1.03 consider this table:
CREATE TABLE table (
key int,
date timestamp,
name string,
surname string,
height int,
weight int,
age int)
CLUSTERED BY(key) INTO 128 BUCKETS
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ',';
Then I tried:
select *
from table
where key=xxx
order by date;
The result is sorted, but everything after the name column is wrong. In fact, all the rows have the exact same values in the respective fields, and the surname column is missing. I also have a bitmap index on name and surname and an index on key.
Is there something wrong with my query, or should I be looking into bugs with ORDER BY (I can't find anything specific)?
It seems like there has been an error loading data into Hive. Make sure you don't have any special characters in your CSV file that might interfere with the insertion.
Also, you have clustered by the key column. Where does this key come from: the CSV, or some other source? Are you sure that it is unique?
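To narrow it down, it may help to look at the raw rows without any sorting. A sketch using the names from the question (xxx stays a placeholder key):

```sql
-- If these rows are already wrong without ORDER BY, the problem is in
-- the load, not in the sort.
SELECT key, date, name, surname
FROM table
WHERE key = xxx
LIMIT 10;
```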

Sequence with variable

In SQL we have a sequence, but it needs to be appended to a prefix, like this:
M1,M2,M3,M4....
Is there any way of doing this?
Consider having the prefix stored in a separate column in the table, e.g.:
CREATE TABLE mytable (
idprefix VARCHAR2(1) NOT NULL,
id NUMBER NOT NULL,
CONSTRAINT mypk PRIMARY KEY (idprefix, id)
);
In the application, or in a view, you can concatenate the values together. Or, in 11g you can create a virtual column that concatenates them.
I give it 99% odds that someone will say "we want to search for ID 12345 regardless of the prefix" and this design means you can have a nice index lookup instead of a "LIKE '%12345'".
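A sketch of the 11g virtual-column variant mentioned above (the column name display_id is made up):

```sql
-- The concatenated form is computed on read; only idprefix and id are stored.
ALTER TABLE mytable ADD (
  display_id VARCHAR2(41)
    GENERATED ALWAYS AS (idprefix || TO_CHAR(id)) VIRTUAL
);
```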
select 'M' || my_sequence.nextval from dual;
