Using FOREACH clause in Memgraph 2.1 - memgraphdb

I'm running Memgraph 2.1 (Platform docker image) and I can't get the FOREACH clause to work:
FOREACH ( a IN [1, 2, 3] | CREATE (n {age : a}))
MATCH (n) RETURN n.age
What am I missing?

The FOREACH clause was added in Memgraph 2.3. You need to upgrade your Memgraph installation to version 2.3 or higher.

Related

PyArrow: Writing a parquet file with a particular schema

For testing purposes, I am trying to generate a file with dummy data, but with the following schema (schema of the real data):
pa.schema([
    pa.field('field1', pa.int64()),
    pa.field('field2', pa.list_(pa.field('element', pa.int64()))),
    pa.field('field3', pa.list_(pa.field('element', pa.float64()))),
    pa.field('field4', pa.list_(pa.field('element', pa.float64()))),
])
I have the following code:
import pyarrow as pa
import pyarrow.parquet as pq
loc = "test.parquet"
data = {
    "field1": [0],
    "field2": [[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]],
    "field3": [[1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9]],
    "field4": [[2.0, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9]]
}
schema1 = pa.schema([
    pa.field('field1', pa.int64()),
    pa.field('field2', pa.list_(pa.field('element', pa.int64()))),
    pa.field('field3', pa.list_(pa.field('element', pa.float64()))),
    pa.field('field4', pa.list_(pa.field('element', pa.float64()))),
])
schema2 = pa.schema([
    pa.field('field1', pa.int64()),
    pa.field('field2', pa.list_(pa.int64())),
    pa.field('field3', pa.list_(pa.float64())),
    pa.field('field4', pa.list_(pa.float64())),
])
writer = pq.ParquetWriter(loc, schema1)
writer.write(pa.table(data))
writer.close()
The dictionary in the code, when converted to a PyArrow table and written to a parquet file, generates a file whose schema matches schema2. Passing schema1 to the writer gives an error. How can I change the dictionary in such a way that its schema matches schema1 when converted to a table?
Semantically the schemas are the same; the name of the list item ("element") should not matter. This used to be an issue, but it has been fixed in pyarrow 11.0.0 (https://issues.apache.org/jira/browse/ARROW-14999).
So you can upgrade pyarrow and it should work.
Alternatively, you can make sure your table has the correct schema by doing either:
writer.write(pa.table(data, schema=schema1))
Or by casting it:
writer.write(pa.table(data).cast(schema1))

Ruby - Unpack array with mixed types

I am trying to use unpack to decode a binary file. The binary file has the following structure:
ABCDEF\tFFFABCDEF\tFFFF....
where
ABCDEF -> String of fixed length
\t -> tab character
FFF -> 3 Floats
.... -> repeat thousands of times
I know how to do it when types are all the same or with only numbers and fixed length arrays, but I am struggling in this situation. For example, if I had a list of floats I would do
s.unpack('F*')
Or if I had integers and floats like
[1, 3.4, 5.2, 4, 2.3, 7.8]
I would do
s.unpack('CF2CF2')
But in this case I am a bit lost. I was hoping to use a format string such as '(CF2)*' with brackets, but it does not work.
I need to use Ruby 2.0.0-p247, if that matters.
Example
ary = ["ABCDEF\t", 3.4, 5.6, 9.1, "FEDCBA\t", 2.5, 8.9, 3.1]
s = ary.pack('P7fffP7fff')
then
s.scan(/.{19}/)
["\xA8lf\xF9\xD4\x7F\x00\x00\x9A\x99Y#33\xB3#\x9A\x99\x11", "A\x80lf\xF9\xD4\x7F\x00\x00\x00\x00 #ff\x0EAff"]
Finally
s.scan(/.{19}/).map{ |item| item.unpack('P7fff') }
Error: #<ArgumentError: no associated pointer>
<main>:in `unpack'
<main>:in `block in <main>'
<main>:in `map'
<main>:in `<main>'
You could read the file in small chunks of 19 bytes and use 'A7fff' to pack and unpack. Do not use the pointer directives ('p' and 'P'), as they need more than 19 bytes to encode your information.
You could also use 'A6xfff' to ignore the 7th byte and get a string with 6 chars.
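To illustrate the difference, a minimal sketch (assuming the 19-byte record layout described above); the only change is whether the tab character survives in the unpacked string:
record = ["ABCDEF\t", 3.4, 5.6, 9.1].pack('A7fff')  # one sample 19-byte record
record.unpack('A7fff')   # => ["ABCDEF\t", 3.4000..., 5.5999..., 9.1000...] (tab kept)
record.unpack('A6xfff')  # => ["ABCDEF", 3.4000..., 5.5999..., 9.1000...]   (7th byte skipped)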
Here's an example, which is similar to the documentation of IO.read:
data = [["ABCDEF\t", 3.4, 5.6, 9.1],
        ["FEDCBA\t", 2.5, 8.9, 3.1]]

binary_file = 'data.bin'
chunk_size = 19
pattern = 'A7fff'

File.open(binary_file, 'wb') do |o|
  data.each do |row|
    o.write row.pack(pattern)
  end
end

raise "Something went wrong. Please check data, pattern and chunk_size." unless File.size(binary_file) == data.length * chunk_size

File.open(binary_file, 'rb') do |f|
  while record = f.read(chunk_size)
    puts '%s %g %g %g' % record.unpack(pattern)
  end
end
# =>
# ABCDEF 3.4 5.6 9.1
# FEDCBA 2.5 8.9 3.1
You could use a multiple of 19 to speed up the process if your file is large.
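For instance, a sketch (reusing the binary_file, chunk_size and pattern variables from the example above) that pulls in 1000 records per read call and then splits them in memory:
records_per_read = 1000

File.open(binary_file, 'rb') do |f|
  while chunk = f.read(chunk_size * records_per_read)
    # /m lets '.' match any byte, so records containing 0x0A are split correctly
    chunk.scan(/.{#{chunk_size}}/m).each do |record|
      name, x, y, z = record.unpack(pattern)
      # process one record here
    end
  end
end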
When dealing with mixed formats that repeat and are of a known fixed size, it is often easier to split the string first. A quick example would be:
binary.scan(/.{LENGTH_OF_DATA}/).map { |item| item.unpack(FORMAT) }
Considering your example above, take the length of the string including the tab character (in bytes), plus the size of 3 floats. If your strings are literally 'ABCDEF\t', you would use a size of 19 (7 for the string, 12 for the 3 floats).
Your final product would look like this:
str.scan(/.{19}/).map { |item| item.unpack('P7fff') }
For example:
irb(main):001:0> ary = ["ABCDEF\t", 3.4, 5.6, 9.1, "FEDCBA\t", 2.5, 8.9, 3.1]
=> ["ABCDEF\t", 3.4, 5.6, 9.1, "FEDCBA\t", 2.5, 8.9, 3.1]
irb(main):002:0> s = ary.pack('pfffpfff')
=> "\xE8Pd\xE4eU\x00\x00\x9A\x99Y#33\xB3#\x9A\x99\x11A\x98Pd\xE4eU\x00\x00\x00\x00 #ff\x0EAffF#"
irb(main):003:0> s.unpack('pfffpfff')
=> ["ABCDEF\t", 3.4000000953674316, 5.599999904632568, 9.100000381469727, "FEDCBA\t", 2.5, 8.899999618530273, 3.0999999046325684]
The minor differences in precision are unavoidable, but do not worry about them; they come from the difference between a 32-bit float and a 64-bit double (which Ruby uses internally), and the precision difference will be less than what is significant for a 32-bit float.
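To see where the rounding comes from, a quick sketch of just the directives (independent of the 19-byte layout): a 32-bit float ('f') cannot hold 3.4 exactly, while a 64-bit double ('D') round-trips it unchanged:
[3.4].pack('f').unpack('f')  # => [3.4000000953674316]
[3.4].pack('D').unpack('D')  # => [3.4]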

Find maximum of all data present in each column in a Pig table using Pig

Input Format:
Year_2010 , Year_2009, Year_2008
1.2, 2.4, 3.5
3.4, 3.8, 5.7
4.5, 5.6, 3.4
3.7, 2.6, 4.8
I have tried the following script and it works for 1 column.
A = Load '/Year.csv' Using PigStorage(',') as (Year_2010:double,Year_2009:double,Year_2008:double);
B = group A ALL;
max = Foreach B generate group,MAX(A.Year_2010);
Expected Output:
Year_2010, Year_2009, Year_2008
4.5, 5.6, 5.7
Take a look at GROUP ALL before applying MAX on the columns.
A = Load '/Year.csv' Using PigStorage(',') as (Year_2010:double,Year_2009:double,Year_2008:double);
B = GROUP A ALL;
C = FOREACH B GENERATE MAX(A.Year_2010),MAX(A.Year_2009),MAX(A.Year_2008);
DUMP C;
Output:
(4.5,5.6,5.7)

Redis Sorted Set: Bulk ZSCORE

How to get a list of members based on their ID from a sorted set instead of just one member?
I would like to build a subset with a set of IDs from the actual sorted set.
I am using a Ruby client for Redis and do not want to iterate one by one, because there could be more than 3000 members that I want to look up.
Here is the issue tracker entry for a new command, ZMSCORE, to do bulk ZSCORE.
There is no variadic form for ZSCORE, yet - see the discussion at: https://github.com/antirez/redis/issues/2344
That said, and for the time being, what you could do is use a Lua script for that. For example:
local scores = {}
while #ARGV > 0 do
  scores[#scores+1] = redis.call('ZSCORE', KEYS[1], table.remove(ARGV, 1))
end
return scores
Running this from the command line would look like:
$ redis-cli ZADD foo 1 a 2 b 3 c 4 d
(integer) 4
$ redis-cli --eval mzscore.lua foo , b d
1) "2"
2) "4"
EDIT: In Ruby, it would probably be something like the following, although you'd be better off using SCRIPT LOAD and EVALSHA and loading the script from an external file (instead of hardcoding it in the app):
require 'redis'
script = <<LUA
local scores = {}
while #ARGV > 0 do
  scores[#scores+1] = redis.call('ZSCORE', KEYS[1], table.remove(ARGV, 1))
end
return scores
LUA
redis = ::Redis.new()
reply = redis.eval(script, ["foo"], ["b", "d"])
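A minimal sketch of the SCRIPT LOAD / EVALSHA variant mentioned above, assuming the same script string and redis connection as in the snippet:
sha = redis.script(:load, script)                # cache the script on the server; returns its SHA1 digest
reply = redis.evalsha(sha, ["foo"], ["b", "d"])  # invoke it by digest from then on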
Lua script to get scores with member IDs:
local scores = {}
while #ARGV > 0 do
  local member_id = table.remove(ARGV, 1)
  local member_score = {}
  member_score[1] = member_id
  member_score[2] = redis.call('ZSCORE', KEYS[1], member_id)
  scores[#scores + 1] = member_score
end
return scores
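From redis-rb this can be invoked the same way as the first script; a sketch, assuming script_with_ids is a hypothetical variable holding the Lua source above and redis is the connection from the earlier snippet:
pairs = redis.eval(script_with_ids, ["foo"], ["b", "d"])
# => [["b", "2"], ["d", "4"]] for the ZADD example shown earlier; each element is a [member, score] pair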

How do I get 64 bit ids to work with Sphinx search server 0.9.9 on Mac OS?

I've been using Sphinx successfully for a while, but just ran into an issue that's got me confused... I back Sphinx with MySQL queries and recently migrated my primary key strategy in a way that made the ids of the tables I'm indexing grow larger than 32 bits (in MySQL they're bigint unsigned). Sphinx was getting index hits, but returning nonsense ids (presumably 32 bits of the id returned by the queries, or something).
I looked into it and realized I hadn't passed the --enable-id64 flag to ./configure. No problem; I completely rebuilt Sphinx with that flag (I'm running 0.9.9, by the way). No change though! I'm still experiencing the exact same issue. My test scenario is pretty simple:
MySQL:
create table test_sphinx(id bigint unsigned primary key, text varchar(200));
insert into test_sphinx values (10102374447, 'Girls Love Justin Beiber');
insert into test_sphinx values (500, 'But Small Ids are working?');
Sphinx conf:
source new_proof
{
    type            = mysql
    sql_host        = 127.0.0.1
    sql_user        = root
    sql_pass        = password
    sql_db          = testdb
    sql_port        =
    sql_query_pre   =
    sql_query_post  =
    sql_query       = SELECT id, text FROM test_sphinx
    sql_query_info  = SELECT * FROM `test_sphinx` WHERE `id` = $id
    sql_attr_bigint = id
}
index new_proof
{
    source          = new_proof
    path            = /usr/local/sphinx/var/data/new_proof
    docinfo         = extern
    morphology      = none
    stopwords       =
    min_word_len    = 1
    charset_type    = utf-8
    enable_star     = 1
    min_prefix_len  = 0
    min_infix_len   = 2
}
Searching:
→ search -i new_proof beiber
Sphinx 0.9.9-release (r2117)
...
index 'new_proof': query 'beiber ': returned 1 matches of 1 total in 0.000 sec
displaying matches:
1. document=1512439855, weight=1
(document not found in db)
words:
1. 'beiber': 1 documents, 1 hits
→ search -i new_proof small
Sphinx 0.9.9-release (r2117)
...
index 'new_proof': query 'small ': returned 1 matches of 1 total in 0.000 sec
displaying matches:
1. document=500, weight=1
id=500
text=But Small Ids are working?
words:
1. 'small': 1 documents, 1 hits
Anyone have an idea about why this is broken?
Thanks in advance
-Phill
EDIT
Ah, okay, got further. I didn't mention that I've been doing all of this testing on Mac OS. It looks like that may be my problem. I just compiled in 64-bit on Linux and it works great. There's also a clue, when I run the Sphinx command-line tools, that the compile didn't take:
My Mac (broken)
Sphinx 0.9.9-release (r2117)
Linux box (working)
Sphinx 0.9.9-id64-release (r2117)
So I guess the new question is what's the trick to compiling for 64 bit keys on mac os?
Did you rebuild the index with the 64-bit indexer?
