How to avoid memory allocations in custom Julia iterators?
Consider the following Julia "compound" iterator: it merges two iterators, a and b,
each of which is assumed to be sorted according to order, into a single ordered
sequence:
struct MergeSorted{T,A,B,O}
    a::A
    b::B
    order::O
    MergeSorted(a::A, b::B, order::O=Base.Order.Forward) where {A,B,O} =
        new{promote_type(eltype(A),eltype(B)),A,B,O}(a, b, order)
end
Base.eltype(::Type{MergeSorted{T,A,B,O}}) where {T,A,B,O} = T
@inline function Base.iterate(self::MergeSorted{T},
                              state=(iterate(self.a), iterate(self.b))) where T
    a_result, b_result = state
    if b_result === nothing
        a_result === nothing && return nothing
        a_curr, a_state = a_result
        return T(a_curr), (iterate(self.a, a_state), b_result)
    end
    b_curr, b_state = b_result
    if a_result !== nothing
        a_curr, a_state = a_result
        Base.Order.lt(self.order, a_curr, b_curr) &&
            return T(a_curr), (iterate(self.a, a_state), b_result)
    end
    return T(b_curr), (a_result, iterate(self.b, b_state))
end
This code works, but it is type-unstable, since Julia's iteration protocol is inherently so (iterate returns either nothing or a (value, state) tuple). In most cases the compiler can work this out automatically; here, however, it cannot, and the following test code shows that temporaries are allocated:
julia> x = MergeSorted([1,4,5,9,32,44], [0,7,9,24,134]);

julia> sum(x);

julia> @time sum(x);
  0.000013 seconds (61 allocations: 2.312 KiB)
Note the allocation count.
Is there any way to debug such situations efficiently, other than tweaking the code and hoping that the compiler will optimize the type ambiguities away? And does anyone know of a solution in this specific case that does not create temporaries?
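As an aside, one way to turn such instabilities into hard failures during testing is the Test stdlib's @inferred macro, which throws when the inferred return type is wider than the concrete type of the runtime value. A minimal sketch with toy functions (the names unstable and stable are ours, not the question's code):

```julia
using Test

# Toy functions to demonstrate the technique:
unstable(flag) = flag ? 1 : "one"   # inferred as Union{Int64, String}
stable(flag)   = flag ? 1 : -1      # inferred as Int64

@inferred stable(true)      # passes and returns 1

# @inferred errors when inference cannot pin down a concrete return type,
# making the instability an explicit test failure instead of a silent slowdown:
try
    @inferred unstable(true)
catch err
    println("unstable: ", err)
end
```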
How to diagnose the problem?
Answer: use @code_warntype
Run:
julia> @code_warntype iterate(x, iterate(x)[2])
Variables
#self#::Core.Const(iterate)
self::MergeSorted{Int64, Vector{Int64}, Vector{Int64}, Base.Order.ForwardOrdering}
state::Tuple{Tuple{Int64, Int64}, Tuple{Int64, Int64}}
#_4::Int64
#_5::Int64
#_6::Union{}
#_7::Int64
b_state::Int64
b_curr::Int64
a_state::Int64
a_curr::Int64
b_result::Tuple{Int64, Int64}
a_result::Tuple{Int64, Int64}
Body::Tuple{Int64, Any}
1 ─ nothing
│ Core.NewvarNode(:(#_4))
│ Core.NewvarNode(:(#_5))
│ Core.NewvarNode(:(#_6))
│ Core.NewvarNode(:(b_state))
│ Core.NewvarNode(:(b_curr))
│ Core.NewvarNode(:(a_state))
│ Core.NewvarNode(:(a_curr))
│ %9 = Base.indexed_iterate(state, 1)::Core.PartialStruct(Tuple{Tuple{Int64, Int64}, Int64}, Any[Tuple{Int64, Int64}, Core.Const(2)])
│ (a_result = Core.getfield(%9, 1))
│ (#_7 = Core.getfield(%9, 2))
│ %12 = Base.indexed_iterate(state, 2, #_7::Core.Const(2))::Core.PartialStruct(Tuple{Tuple{Int64, Int64}, Int64}, Any[Tuple{Int64, Int64}, Core.Const(3)])
│ (b_result = Core.getfield(%12, 1))
│ %14 = (b_result === Main.nothing)::Core.Const(false)
└── goto #3 if not %14
2 ─ Core.Const(:(a_result === Main.nothing))
│ Core.Const(:(%16))
│ Core.Const(:(return Main.nothing))
│ Core.Const(:(Base.indexed_iterate(a_result, 1)))
│ Core.Const(:(a_curr = Core.getfield(%19, 1)))
│ Core.Const(:(#_6 = Core.getfield(%19, 2)))
│ Core.Const(:(Base.indexed_iterate(a_result, 2, #_6)))
│ Core.Const(:(a_state = Core.getfield(%22, 1)))
│ Core.Const(:(($(Expr(:static_parameter, 1)))(a_curr)))
│ Core.Const(:(Base.getproperty(self, :a)))
│ Core.Const(:(Main.iterate(%25, a_state)))
│ Core.Const(:(Core.tuple(%26, b_result)))
│ Core.Const(:(Core.tuple(%24, %27)))
└── Core.Const(:(return %28))
3 ┄ %30 = Base.indexed_iterate(b_result, 1)::Core.PartialStruct(Tuple{Int64, Int64}, Any[Int64, Core.Const(2)])
│ (b_curr = Core.getfield(%30, 1))
│ (#_5 = Core.getfield(%30, 2))
│ %33 = Base.indexed_iterate(b_result, 2, #_5::Core.Const(2))::Core.PartialStruct(Tuple{Int64, Int64}, Any[Int64, Core.Const(3)])
│ (b_state = Core.getfield(%33, 1))
│ %35 = (a_result !== Main.nothing)::Core.Const(true)
└── goto #6 if not %35
4 ─ %37 = Base.indexed_iterate(a_result, 1)::Core.PartialStruct(Tuple{Int64, Int64}, Any[Int64, Core.Const(2)])
│ (a_curr = Core.getfield(%37, 1))
│ (#_4 = Core.getfield(%37, 2))
│ %40 = Base.indexed_iterate(a_result, 2, #_4::Core.Const(2))::Core.PartialStruct(Tuple{Int64, Int64}, Any[Int64, Core.Const(3)])
│ (a_state = Core.getfield(%40, 1))
│ %42 = Base.Order::Core.Const(Base.Order)
│ %43 = Base.getproperty(%42, :lt)::Core.Const(Base.Order.lt)
│ %44 = Base.getproperty(self, :order)::Core.Const(Base.Order.ForwardOrdering())
│ %45 = a_curr::Int64
│ %46 = (%43)(%44, %45, b_curr)::Bool
└── goto #6 if not %46
5 ─ %48 = ($(Expr(:static_parameter, 1)))(a_curr)::Int64
│ %49 = Base.getproperty(self, :a)::Vector{Int64}
│ %50 = Main.iterate(%49, a_state)::Union{Nothing, Tuple{Int64, Int64}}
│ %51 = Core.tuple(%50, b_result)::Tuple{Union{Nothing, Tuple{Int64, Int64}}, Tuple{Int64, Int64}}
│ %52 = Core.tuple(%48, %51)::Tuple{Int64, Tuple{Union{Nothing, Tuple{Int64, Int64}}, Tuple{Int64, Int64}}}
└── return %52
6 ┄ %54 = ($(Expr(:static_parameter, 1)))(b_curr)::Int64
│ %55 = a_result::Tuple{Int64, Int64}
│ %56 = Base.getproperty(self, :b)::Vector{Int64}
│ %57 = Main.iterate(%56, b_state)::Union{Nothing, Tuple{Int64, Int64}}
│ %58 = Core.tuple(%55, %57)::Tuple{Tuple{Int64, Int64}, Union{Nothing, Tuple{Int64, Int64}}}
│ %59 = Core.tuple(%54, %58)::Tuple{Int64, Tuple{Tuple{Int64, Int64}, Union{Nothing, Tuple{Int64, Int64}}}}
└── return %59
and you see that there are too many possible return types, so Julia gives up specializing them (and widens the second element of the returned tuple to Any).
How to fix the problem?
Answer: reduce the number of possible return types of iterate.
Here is a quick write-up. I do not claim it is the tersest solution, and I have not tested it extensively, so there may be bugs, but it was simple enough to write quickly from your code to show how one could approach the problem. Note that I use special branches for when one of the collections is empty, since then it should be faster to just iterate the other collection:
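The principle can be seen on a toy example (the names S, tuple_state, and struct_state are ours): a state built as a tuple of two Union{Nothing,T} values has four possible concrete types, while a concrete wrapper struct whose fields hold the unions always has exactly one, so the compiler keeps a small, splittable union:

```julia
# Toy illustration: why wrapping two possibly-`nothing` results in one
# concrete struct shrinks the set of possible state types.
struct S{T}
    a::Union{Nothing,T}
    b::Union{Nothing,T}
end

tuple_state(a, b)  = (a, b)        # concrete tuple type varies with a and b
struct_state(a, b) = S{Int}(a, b)  # always the single concrete type S{Int}

args = Tuple{Union{Nothing,Int},Union{Nothing,Int}}
ts = Base.return_types(tuple_state, args)[1]
ss = Base.return_types(struct_state, args)[1]
println((isconcretetype(ts), isconcretetype(ss)))  # (false, true)
```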
struct MergeSorted{T,A,B,O,F1,F2}
    a::A
    b::B
    order::O
    fa::F1
    fb::F2
    function MergeSorted(a::A, b::B, order::O=Base.Order.Forward) where {A,B,O}
        fa, fb = iterate(a), iterate(b)
        F1 = typeof(fa)
        F2 = typeof(fb)
        new{promote_type(eltype(A),eltype(B)),A,B,O,F1,F2}(a, b, order, fa, fb)
    end
end
# note the `<:`, so the method also matches the extra F1/F2 type parameters
Base.eltype(::Type{<:MergeSorted{T}}) where {T} = T
struct State{Ta, Tb}
    a::Union{Nothing, Ta}
    b::Union{Nothing, Tb}
end
function Base.iterate(self::MergeSorted{T,A,B,O,Nothing,Nothing}) where {T,A,B,O}
    return nothing
end
function Base.iterate(self::MergeSorted{T,A,B,O,F1,Nothing}) where {T,A,B,O,F1}
    return self.fa
end
function Base.iterate(self::MergeSorted{T,A,B,O,F1,Nothing}, state) where {T,A,B,O,F1}
    return iterate(self.a, state)
end
function Base.iterate(self::MergeSorted{T,A,B,O,Nothing,F2}) where {T,A,B,O,F2}
    return self.fb
end
function Base.iterate(self::MergeSorted{T,A,B,O,Nothing,F2}, state) where {T,A,B,O,F2}
    return iterate(self.b, state)
end
@inline function Base.iterate(self::MergeSorted{T,A,B,O,F1,F2}) where {T,A,B,O,F1,F2}
    a_result, b_result = self.fa, self.fb
    return iterate(self, State{F1,F2}(a_result, b_result))
end
@inline function Base.iterate(self::MergeSorted{T,A,B,O,F1,F2},
                              state::State{F1,F2}) where {T,A,B,O,F1,F2}
    a_result, b_result = state.a, state.b
    if b_result === nothing
        a_result === nothing && return nothing
        a_curr, a_state = a_result
        return T(a_curr), State{F1,F2}(iterate(self.a, a_state), b_result)
    end
    b_curr, b_state = b_result
    if a_result !== nothing
        a_curr, a_state = a_result
        Base.Order.lt(self.order, a_curr, b_curr) &&
            return T(a_curr), State{F1,F2}(iterate(self.a, a_state), b_result)
    end
    return T(b_curr), State{F1,F2}(a_result, iterate(self.b, b_state))
end
And now you have:
julia> x = MergeSorted([1,4,5,9,32,44], [0,7,9,24,134]);
julia> sum(x)
269
julia> @allocated sum(x)
0
julia> @code_warntype iterate(x, iterate(x)[2])
Variables
#self#::Core.Const(iterate)
self::MergeSorted{Int64, Vector{Int64}, Vector{Int64}, Base.Order.ForwardOrdering, Tuple{Int64, Int64}, Tuple{Int64, Int64}}
state::State{Tuple{Int64, Int64}, Tuple{Int64, Int64}}
#_4::Int64
#_5::Int64
#_6::Int64
b_state::Int64
b_curr::Int64
a_state::Int64
a_curr::Int64
b_result::Union{Nothing, Tuple{Int64, Int64}}
a_result::Union{Nothing, Tuple{Int64, Int64}}
Body::Union{Nothing, Tuple{Int64, State{Tuple{Int64, Int64}, Tuple{Int64, Int64}}}}
1 ─ nothing
│ Core.NewvarNode(:(#_4))
│ Core.NewvarNode(:(#_5))
│ Core.NewvarNode(:(#_6))
│ Core.NewvarNode(:(b_state))
│ Core.NewvarNode(:(b_curr))
│ Core.NewvarNode(:(a_state))
│ Core.NewvarNode(:(a_curr))
│ %9 = Base.getproperty(state, :a)::Union{Nothing, Tuple{Int64, Int64}}
│ %10 = Base.getproperty(state, :b)::Union{Nothing, Tuple{Int64, Int64}}
│ (a_result = %9)
│ (b_result = %10)
│ %13 = (b_result === Main.nothing)::Bool
└── goto #5 if not %13
2 ─ %15 = (a_result === Main.nothing)::Bool
└── goto #4 if not %15
3 ─ return Main.nothing
4 ─ %18 = Base.indexed_iterate(a_result::Tuple{Int64, Int64}, 1)::Core.PartialStruct(Tuple{Int64, Int64}, Any[Int64, Core.Const(2)])
│ (a_curr = Core.getfield(%18, 1))
│ (#_6 = Core.getfield(%18, 2))
│ %21 = Base.indexed_iterate(a_result::Tuple{Int64, Int64}, 2, #_6::Core.Const(2))::Core.PartialStruct(Tuple{Int64, Int64}, Any[Int64, Core.Const(3)])
│ (a_state = Core.getfield(%21, 1))
│ %23 = ($(Expr(:static_parameter, 1)))(a_curr)::Int64
│ %24 = Core.apply_type(Main.State, $(Expr(:static_parameter, 5)), $(Expr(:static_parameter, 6)))::Core.Const(State{Tuple{Int64, Int64}, Tuple{Int64, Int64}})
│ %25 = Base.getproperty(self, :a)::Vector{Int64}
│ %26 = Main.iterate(%25, a_state)::Union{Nothing, Tuple{Int64, Int64}}
│ %27 = (%24)(%26, b_result::Core.Const(nothing))::State{Tuple{Int64, Int64}, Tuple{Int64, Int64}}
│ %28 = Core.tuple(%23, %27)::Tuple{Int64, State{Tuple{Int64, Int64}, Tuple{Int64, Int64}}}
└── return %28
5 ─ %30 = Base.indexed_iterate(b_result::Tuple{Int64, Int64}, 1)::Core.PartialStruct(Tuple{Int64, Int64}, Any[Int64, Core.Const(2)])
│ (b_curr = Core.getfield(%30, 1))
│ (#_5 = Core.getfield(%30, 2))
│ %33 = Base.indexed_iterate(b_result::Tuple{Int64, Int64}, 2, #_5::Core.Const(2))::Core.PartialStruct(Tuple{Int64, Int64}, Any[Int64, Core.Const(3)])
│ (b_state = Core.getfield(%33, 1))
│ %35 = (a_result !== Main.nothing)::Bool
└── goto #8 if not %35
6 ─ %37 = Base.indexed_iterate(a_result::Tuple{Int64, Int64}, 1)::Core.PartialStruct(Tuple{Int64, Int64}, Any[Int64, Core.Const(2)])
│ (a_curr = Core.getfield(%37, 1))
│ (#_4 = Core.getfield(%37, 2))
│ %40 = Base.indexed_iterate(a_result::Tuple{Int64, Int64}, 2, #_4::Core.Const(2))::Core.PartialStruct(Tuple{Int64, Int64}, Any[Int64, Core.Const(3)])
│ (a_state = Core.getfield(%40, 1))
│ %42 = Base.Order::Core.Const(Base.Order)
│ %43 = Base.getproperty(%42, :lt)::Core.Const(Base.Order.lt)
│ %44 = Base.getproperty(self, :order)::Core.Const(Base.Order.ForwardOrdering())
│ %45 = a_curr::Int64
│ %46 = (%43)(%44, %45, b_curr)::Bool
└── goto #8 if not %46
7 ─ %48 = ($(Expr(:static_parameter, 1)))(a_curr)::Int64
│ %49 = Core.apply_type(Main.State, $(Expr(:static_parameter, 5)), $(Expr(:static_parameter, 6)))::Core.Const(State{Tuple{Int64, Int64}, Tuple{Int64, Int64}})
│ %50 = Base.getproperty(self, :a)::Vector{Int64}
│ %51 = Main.iterate(%50, a_state)::Union{Nothing, Tuple{Int64, Int64}}
│ %52 = (%49)(%51, b_result::Tuple{Int64, Int64})::State{Tuple{Int64, Int64}, Tuple{Int64, Int64}}
│ %53 = Core.tuple(%48, %52)::Tuple{Int64, State{Tuple{Int64, Int64}, Tuple{Int64, Int64}}}
└── return %53
8 ┄ %55 = ($(Expr(:static_parameter, 1)))(b_curr)::Int64
│ %56 = Core.apply_type(Main.State, $(Expr(:static_parameter, 5)), $(Expr(:static_parameter, 6)))::Core.Const(State{Tuple{Int64, Int64}, Tuple{Int64, Int64}})
│ %57 = a_result::Union{Nothing, Tuple{Int64, Int64}}
│ %58 = Base.getproperty(self, :b)::Vector{Int64}
│ %59 = Main.iterate(%58, b_state)::Union{Nothing, Tuple{Int64, Int64}}
│ %60 = (%56)(%57, %59)::State{Tuple{Int64, Int64}, Tuple{Int64, Int64}}
│ %61 = Core.tuple(%55, %60)::Tuple{Int64, State{Tuple{Int64, Int64}, Tuple{Int64, Int64}}}
└── return %61
EDIT: I have now realized that my implementation is not fully correct: it assumes that the non-nothing return value of iterate is type stable (which it does not have to be). But if it is not type stable, then the compiler must allocate anyway. So a fully correct solution would first check whether iterate is type stable; if it is, use my solution, and if it is not, fall back to e.g. your solution.
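Such a check could be sketched with inference reflection. The helper name below is ours; Base.return_types reports what inference concludes, and Base.uniontypes is a non-exported internal that splits a Union into its components, so treat this as a sketch rather than a stable API:

```julia
# Sketch of a stability check: accept `iterate`'s inferred return type if
# it is `Union{Nothing, T}` (or plain `T` / `Nothing`) with `T` concrete.
function iterate_is_type_stable(itr)
    rts = Base.return_types(iterate, Tuple{typeof(itr)})
    length(rts) == 1 || return false   # ambiguous: multiple matching methods
    all(p -> p === Nothing || isconcretetype(p), Base.uniontypes(rts[1]))
end

println(iterate_is_type_stable([1, 2, 3]))      # true:  Union{Nothing, Tuple{Int64, Int64}}
println(iterate_is_type_stable(Any[1, "two"]))  # false: Tuple{Any, Int64} is not concrete
```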