What should the good suffix shift table look like in Boyer-Moore? - algorithm

Do I understand the good suffix shift table correctly? Here are the tables I computed for two patterns:
Pattern = GTAGCGGCG
|Pattern| = 9
goodshift[0] : g 1
goodshift[1] : (c)g 2
goodshift[2] : (g)cg 8
goodshift[3] : (g)gcg 3
goodshift[4] : (c)ggcg 8
goodshift[5] : (g)cggcg 8
goodshift[6] : (a)gcggcg 8
goodshift[7] : (t)agcggcg 8
goodshift[8] : (g)tagcggcg 8
Pattern = ABCBAB
|Pattern| = 6
goodshift[0] : b 1
goodshift[1] : (a)b 2
goodshift[2] : (b)ab 4
goodshift[3] : (c)bab 4
goodshift[4] : (b)cbab 4
goodshift[5] : (a)bcbab 4
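One way to check tables like these is a brute-force computation. The sketch below is only an illustration under two assumptions read off the layout above: goodshift[k] is the shift after k pattern characters have matched from the right, and the "strong" rule applies (the character that lines up with the mismatch position must differ from the one that failed). The function name good_suffix_shifts is hypothetical.

```python
def good_suffix_shifts(pattern):
    """Brute-force good suffix shifts, indexed by number of matched characters."""
    n = len(pattern)
    shifts = []
    for matched in range(n):
        mismatch = n - 1 - matched  # position where the comparison failed
        for shift in range(1, n + 1):
            # Every matched suffix character must line up with an equal
            # character, or fall off the left end of the pattern.
            suffix_ok = all(
                i - shift < 0 or pattern[i - shift] == pattern[i]
                for i in range(mismatch + 1, n)
            )
            # Strong rule: the character now facing the mismatch position
            # must differ from the one that already failed there.
            differs = (
                mismatch - shift < 0
                or pattern[mismatch - shift] != pattern[mismatch]
            )
            if suffix_ok and differs:
                shifts.append(shift)
                break  # smallest valid shift; shift == n always qualifies
    return shifts


print(good_suffix_shifts("GTAGCGGCG"))  # [1, 2, 8, 3, 8, 8, 8, 8, 8]
print(good_suffix_shifts("ABCBAB"))     # [1, 2, 4, 4, 4, 4]
```

Under those assumptions both tables in the question come out the same. The brute force is quadratic or worse in the pattern length, so it is only meant for checking small examples by hand, not as a preprocessing step.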

Related

ILP - count the number of variables that received a given value

In an ILP, is it possible to have a variable whose value will be the number of variables with value N?
N is a bounded integer, with lower bound 1.
Thank you
This achieves the goal. It is written in Pyomo but should be fairly easy to translate to other frameworks.
Code:
# magic number counter
import pyomo.environ as pyo
M = 100
magic_number=7
m = pyo.ConcreteModel()
m.I = pyo.Set(initialize=[1,2,3,4])
m.x = pyo.Var(m.I, domain=pyo.NonNegativeIntegers, bounds=(1, M))
m.magic = pyo.Var(m.I, domain=pyo.Binary)
# obj: maximize the sum of x, plus a small reward for each magic number
m.obj = pyo.Objective(expr=sum(m.x[i] + 0.1*m.magic[i] for i in m.I), sense=pyo.maximize)
# constraints
m.sum_limit = pyo.Constraint(expr=pyo.sum_product(m.x) <= 19)
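# magic[i] == 1 forces x[i] <= magic_number (inactive when magic[i] == 0)
@m.Constraint(m.I)
def linking_1(m, i):
    return m.x[i] <= magic_number + (1 - m.magic[i]) * M
# magic[i] == 1 forces x[i] >= magic_number
@m.Constraint(m.I)
def linking_2(m, i):
    return m.x[i] >= magic_number * m.magic[i]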
solver = pyo.SolverFactory('glpk')
soln = solver.solve(m)
print(soln)
m.display()
print(f"\nmagic numbers {magic_number}'s produced: {pyo.value(pyo.sum_product(m.magic))}")
Output:
Problem:
- Name: unknown
Lower bound: 19.2
Upper bound: 19.2
Number of objectives: 1
Number of constraints: 10
Number of variables: 9
Number of nonzeros: 21
Sense: maximize
Solver:
- Status: ok
Termination condition: optimal
Statistics:
Branch and bound:
Number of bounded subproblems: 5
Number of created subproblems: 5
Error rc: 0
Time: 0.005478858947753906
Solution:
- number of solutions: 0
number of solutions displayed: 0
Model unknown
Variables:
x : Size=4, Index=I
Key : Lower : Value : Upper : Fixed : Stale : Domain
1 : 1 : 7.0 : 100 : False : False : NonNegativeIntegers
2 : 1 : 4.0 : 100 : False : False : NonNegativeIntegers
3 : 1 : 7.0 : 100 : False : False : NonNegativeIntegers
4 : 1 : 1.0 : 100 : False : False : NonNegativeIntegers
magic : Size=4, Index=I
Key : Lower : Value : Upper : Fixed : Stale : Domain
1 : 0 : 1.0 : 1 : False : False : Binary
2 : 0 : 0.0 : 1 : False : False : Binary
3 : 0 : 1.0 : 1 : False : False : Binary
4 : 0 : 0.0 : 1 : False : False : Binary
Objectives:
obj : Size=1, Index=None, Active=True
Key : Active : Value
None : True : 19.200000000000003
Constraints:
sum_limit : Size=1
Key : Lower : Body : Upper
None : None : 19.0 : 19.0
linking_1 : Size=4
Key : Lower : Body : Upper
1 : None : 0.0 : 0.0
2 : None : -103.0 : 0.0
3 : None : 0.0 : 0.0
4 : None : -106.0 : 0.0
linking_2 : Size=4
Key : Lower : Body : Upper
1 : None : 0.0 : 0.0
2 : None : -4.0 : 0.0
3 : None : 0.0 : 0.0
4 : None : -1.0 : 0.0
magic numbers 7's produced: 2.0
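The effect of the two linking constraints can be checked by brute force without a solver. The sketch below is only an illustration: it reuses M = 100 and the magic number 7 from the model above, and the helper name linking_ok is hypothetical.

```python
M = 100            # big-M, same value as in the model above
MAGIC_NUMBER = 7


def linking_ok(x, magic):
    """True if (x, magic) satisfies both linking constraints."""
    upper = x <= MAGIC_NUMBER + (1 - magic) * M   # linking_1
    lower = x >= MAGIC_NUMBER * magic             # linking_2
    return upper and lower


# With the indicator set, only the magic number itself is feasible.
allowed_when_flagged = [x for x in range(1, M + 1) if linking_ok(x, 1)]
# With the indicator cleared, every value in the bounds is feasible.
allowed_when_unflagged = [x for x in range(1, M + 1) if linking_ok(x, 0)]

print(allowed_when_flagged)          # [7]
print(len(allowed_when_unflagged))   # 100
```

Note that x = 7 with the indicator at 0 is also feasible, so the constraints alone only guarantee that the indicator sum does not overcount. In the model above the count comes out exact because the objective rewards each indicator that is set to 1.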

Incorrect values with concurrency example

I'm working through an ebook and here I am doing this code:
func f(n int) {
    for i := 0; i < 10; i++ {
        fmt.Println(n, ":", i)
        amt := time.Duration(rand.Intn(250))
        time.Sleep(time.Millisecond * amt)
    }
}
func ConcurrMain() {
    for i := 0; i < 10; i++ {
        go f(i)
    }
    fmt.Println("hi")
}
The problem is that I get different output on every run through goclipse. It seems to work better when I do not execute the sleep command.
In any case, I think a step-by-step explanation would help me learn concurrency.
Here's an example of my return value with time sleep:
4 : 0
1 : 0
6 : 0
0 : 0
5 : 0
7 : 0
8 : 0
2 : 0
3 : 0
9 : 0
9 : 1
hi
Then I run it again and I now get with time sleep:
0 : 0
hi
1 : 0
And now a third time and I get with time sleep:
0 : 0
hi
And this time without sleep i get:
1 : 0
1 : 1
1 : 2
1 : 3
1 : 4
1 : 5
1 : 6
1 : 7
1 : 8
3 : 0
3 : 1
4 : 0
4 : 1
4 : 2
9 : 0
6 : 0
6 : 1
6 : 2
2 : 0
2 : 1
2 : 2
2 : 3
2 : 4
2 : 5
2 : 6
2 : 7
2 : 8
2 : 9
7 : 0
7 : 1
7 : 2
7 : 3
7 : 4
7 : 5
7 : 6
5 : 0
5 : 1
5 : 2
5 : 3
5 : 4
5 : 5
5 : 6
5 : 7
5 : 8
5 : 9
9 : 1
1 : 9
8 : 0
3 : 2
hi
I'm using goclipse under windows in case that information is pertinent
UPDATE
It should be noted that this function is called from the main package as:
func main() {
    lab.ConcurrMain()
}
UPDATE2
I added var wg sync.WaitGroup outside the func scope. I also added wg.Add(1) inside the ConcurrMain method just prior to go f(i), then I added wg.Wait() just after the for loop which launches the goroutines, and finally wg.Done() inside the f() method. That seems to work, but that answer wasn't in the ebook, so I want to understand why.
In Go, a program runs only for as long as its main goroutine is running. That is the goroutine that originally runs the main function; if it exits, the entire program exits.
In the code you originally posted, nothing controls when the goroutines that you start in ConcurrMain will be scheduled and run. Before you added the sync.WaitGroup, the main goroutine could exit before the additional goroutines had been scheduled, or after they had printed only some of their lines. That is why the output differs from run to run.
If the main goroutine exits, the entire program exits and the other goroutines do not get to finish. The main goroutine will not wait for the other goroutines unless you specifically ask it to, for example with a wait group.

Selecting neighbours on a circle

Suppose we have N points on a circle. Each point is assigned an index i = 1, 2, ..., N. Now, for a randomly selected point, I want a vector containing the indices of 5 points: [two left neighbors, the point itself, two right neighbors].
See the figure below.
Some examples are as follows:
N = 18;
selectedPointIdx = 4;
sequence = [2 3 4 5 6];
selectedPointIdx = 1
sequence = [17 18 1 2 3]
selectedPointIdx = 17
sequence = [15 16 17 18 1];
The conventional way to code this is considering the exceptions as if-else statements, as I did:
if ii == 1
    lseq = [N-1 N ii ii+1 ii+2];
elseif ii == 2
    lseq = [N ii-1 ii ii+1 ii+2];
elseif ii == N-1
    lseq = [ii-2 ii-1 ii N 1];
elseif ii == N
    lseq = [ii-2 ii-1 ii 1 2];
else
    lseq = [ii-2 ii-1 ii ii+1 ii+2];
end
where ii is selectedPointIdx.
This does not scale if I want, for instance, 7 points instead of 5, because every extra neighbor adds more special cases. What is a more efficient way?
How about this -
off = -2:2
out = mod((off + selectedPointIdx) + 17,18) + 1
For a window size of 7, edit off to -3:3.
It uses the strategy of subtracting 1 + modding + adding back 1 as also discussed here.
Sample run -
>> off = -2:2;
for selectedPointIdx = 1:18
disp(['For selectedPointIdx =',num2str(selectedPointIdx),' :'])
disp(mod((off + selectedPointIdx) + 17,18) + 1)
end
For selectedPointIdx =1 :
17 18 1 2 3
For selectedPointIdx =2 :
18 1 2 3 4
For selectedPointIdx =3 :
1 2 3 4 5
For selectedPointIdx =4 :
2 3 4 5 6
For selectedPointIdx =5 :
3 4 5 6 7
For selectedPointIdx =6 :
4 5 6 7 8
....
For selectedPointIdx =11 :
9 10 11 12 13
For selectedPointIdx =12 :
10 11 12 13 14
For selectedPointIdx =13 :
11 12 13 14 15
For selectedPointIdx =14 :
12 13 14 15 16
For selectedPointIdx =15 :
13 14 15 16 17
For selectedPointIdx =16 :
14 15 16 17 18
For selectedPointIdx =17 :
15 16 17 18 1
For selectedPointIdx =18 :
16 17 18 1 2
You can use modular arithmetic instead: Let p be the point among N points numbered 1 to N. Say you want m neighbors on each side, you can get them as follows:
(p - m - 1) mod N + 1
...
(p - 4) mod N + 1
(p - 3) mod N + 1
(p - 2) mod N + 1
p
p mod N + 1
(p + 1) mod N + 1
(p + 2) mod N + 1
...
(p + m - 1) mod N + 1
Code:
N = 18;
p = 2;
m = 3;
for i = p - m : p + m
    nb = mod((i - 1), N) + 1;
    disp(nb);
end
Note that avoiding an if statement will not necessarily improve performance. A benchmark might be necessary to find out. In any case, the difference will only be significant if you are processing tens of thousands of numbers.
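Both answers above rely on the same idea: shift to 0-based indices, take the modulus, and shift back to 1-based. As a rough sketch of that idea outside MATLAB, here it is in Python for arbitrary N and m neighbors per side. The function name circular_window is hypothetical.

```python
def circular_window(p, n, m):
    """Indices of the m left neighbors, p itself, and the m right neighbors
    on a circle of n points numbered 1..n."""
    # (p + k - 1) % n maps to 0..n-1; adding 1 returns to 1-based indices.
    # Python's % is non-negative for a positive modulus, like MATLAB's mod.
    return [(p + k - 1) % n + 1 for k in range(-m, m + 1)]


print(circular_window(4, 18, 2))   # [2, 3, 4, 5, 6]
print(circular_window(1, 18, 2))   # [17, 18, 1, 2, 3]
print(circular_window(17, 18, 2))  # [15, 16, 17, 18, 1]
print(circular_window(2, 18, 3))   # [17, 18, 1, 2, 3, 4, 5]
```

The first three calls reproduce the examples from the question; changing m is all it takes to get a 7-point window.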

Libavcodec: How to tell end of access unit when decoding H.264 stream

I'm receiving H.264 video over RTP and decoding it with libavcodec. I'm unpackaging the NAL units from the RTP packets before feeding them to avcodec (including reassembling fragmentation units).
I'm trying to show the effective decoding frame rate. I used to log the time after a successful decode video call where *got_picture_ptr is non-zero. So far this worked, since I only ever got video with one slice per frame. But now I receive video where both I and P frames consist of 2 NAL units each, of types 5 and 1 respectively. Now when I feed either slice of a frame, decode_video reports that it got a picture, and pAVFrame->coded_picture_number is incremented for every slice.
How do I go about reliably finding the beginning or end of a video frame/picture/access unit?
I've dumped out a few NAL units from the stream and run them through h264_analyze from h264bitstream.
Output from h264_analyze on 4 NAL Units
!! Found NAL at offset 695262 (0xA9BDE), size 25 (0x0019)
==================== NAL ====================
forbidden_zero_bit : 0
nal_ref_idc : 1
nal_unit_type : 7 ( Sequence parameter set )
======= SPS =======
profile_idc : 66
constraint_set0_flag : 1
constraint_set1_flag : 1
constraint_set2_flag : 1
constraint_set3_flag : 0
reserved_zero_4bits : 0
level_idc : 32
seq_parameter_set_id : 0
chroma_format_idc : 0
residual_colour_transform_flag : 0
bit_depth_luma_minus8 : 0
bit_depth_chroma_minus8 : 0
qpprime_y_zero_transform_bypass_flag : 0
seq_scaling_matrix_present_flag : 0
log2_max_frame_num_minus4 : 12
pic_order_cnt_type : 2
log2_max_pic_order_cnt_lsb_minus4 : 0
delta_pic_order_always_zero_flag : 0
offset_for_non_ref_pic : 0
offset_for_top_to_bottom_field : 0
num_ref_frames_in_pic_order_cnt_cycle : 0
num_ref_frames : 1
gaps_in_frame_num_value_allowed_flag : 0
pic_width_in_mbs_minus1 : 79
pic_height_in_map_units_minus1 : 44
frame_mbs_only_flag : 1
mb_adaptive_frame_field_flag : 0
direct_8x8_inference_flag : 1
frame_cropping_flag : 0
frame_crop_left_offset : 0
frame_crop_right_offset : 0
frame_crop_top_offset : 0
frame_crop_bottom_offset : 0
vui_parameters_present_flag : 1
=== VUI ===
aspect_ratio_info_present_flag : 1
aspect_ratio_idc : 1
sar_width : 0
sar_height : 0
overscan_info_present_flag : 0
overscan_appropriate_flag : 0
video_signal_type_present_flag : 1
video_format : 5
video_full_range_flag : 1
colour_description_present_flag : 0
colour_primaries : 0
transfer_characteristics : 0
matrix_coefficients : 0
chroma_loc_info_present_flag : 0
chroma_sample_loc_type_top_field : 0
chroma_sample_loc_type_bottom_field : 0
timing_info_present_flag : 1
num_units_in_tick : 1
time_scale : 25
fixed_frame_rate_flag : 0
nal_hrd_parameters_present_flag : 0
vcl_hrd_parameters_present_flag : 0
low_delay_hrd_flag : 0
pic_struct_present_flag : 0
bitstream_restriction_flag : 1
motion_vectors_over_pic_boundaries_flag : 1
max_bytes_per_pic_denom : 0
max_bits_per_mb_denom : 0
log2_max_mv_length_horizontal : 6
log2_max_mv_length_vertical : 6
num_reorder_frames : 0
max_dec_frame_buffering : 1
=== HRD ===
cpb_cnt_minus1 : 0
bit_rate_scale : 0
cpb_size_scale : 0
initial_cpb_removal_delay_length_minus1 : 0
cpb_removal_delay_length_minus1 : 0
dpb_output_delay_length_minus1 : 0
time_offset_length : 0
!! Found NAL at offset 695290 (0xA9BFA), size 4 (0x0004)
==================== NAL ====================
forbidden_zero_bit : 0
nal_ref_idc : 1
nal_unit_type : 8 ( Picture parameter set )
======= PPS =======
pic_parameter_set_id : 0
seq_parameter_set_id : 0
entropy_coding_mode_flag : 0
pic_order_present_flag : 0
num_slice_groups_minus1 : 0
slice_group_map_type : 0
num_ref_idx_l0_active_minus1 : 0
num_ref_idx_l1_active_minus1 : 0
weighted_pred_flag : 0
weighted_bipred_idc : 0
pic_init_qp_minus26 : 3
pic_init_qs_minus26 : 0
chroma_qp_index_offset : 0
deblocking_filter_control_present_flag : 1
constrained_intra_pred_flag : 0
redundant_pic_cnt_present_flag : 0
transform_8x8_mode_flag : 1
pic_scaling_matrix_present_flag : 0
second_chroma_qp_index_offset : 1
!! Found NAL at offset 695297 (0xA9C01), size 50725 (0xC625)
==================== NAL ====================
forbidden_zero_bit : 0
nal_ref_idc : 1
nal_unit_type : 5 ( Coded slice of an IDR picture )
======= Slice Header =======
first_mb_in_slice : 0
slice_type : 2 ( I slice )
pic_parameter_set_id : 0
frame_num : 0
field_pic_flag : 0
bottom_field_flag : 0
idr_pic_id : 0
pic_order_cnt_lsb : 0
delta_pic_order_cnt_bottom : 0
redundant_pic_cnt : 0
direct_spatial_mv_pred_flag : 0
num_ref_idx_active_override_flag : 0
num_ref_idx_l0_active_minus1 : 0
num_ref_idx_l1_active_minus1 : 0
cabac_init_idc : 0
slice_qp_delta : 5
sp_for_switch_flag : 0
slice_qs_delta : 0
disable_deblocking_filter_idc : 0
slice_alpha_c0_offset_div2 : 0
slice_beta_offset_div2 : 0
slice_group_change_cycle : 0
=== Prediction Weight Table ===
luma_log2_weight_denom : 0
chroma_log2_weight_denom : 0
luma_weight_l0_flag : 0
chroma_weight_l0_flag : 0
luma_weight_l1_flag : 0
chroma_weight_l1_flag : 0
=== Ref Pic List Reordering ===
ref_pic_list_reordering_flag_l0 : 0
ref_pic_list_reordering_flag_l1 : 0
=== Decoded Ref Pic Marking ===
no_output_of_prior_pics_flag : 0
long_term_reference_flag : 0
adaptive_ref_pic_marking_mode_flag : 0
!! Found NAL at offset 746025 (0xB6229), size 38612 (0x96D4)
==================== NAL ====================
forbidden_zero_bit : 0
nal_ref_idc : 1
nal_unit_type : 5 ( Coded slice of an IDR picture )
======= Slice Header =======
first_mb_in_slice : 1840
slice_type : 2 ( I slice )
pic_parameter_set_id : 0
frame_num : 0
field_pic_flag : 0
bottom_field_flag : 0
idr_pic_id : 0
pic_order_cnt_lsb : 0
delta_pic_order_cnt_bottom : 0
redundant_pic_cnt : 0
direct_spatial_mv_pred_flag : 0
num_ref_idx_active_override_flag : 0
num_ref_idx_l0_active_minus1 : 0
num_ref_idx_l1_active_minus1 : 0
cabac_init_idc : 0
slice_qp_delta : 5
sp_for_switch_flag : 0
slice_qs_delta : 0
disable_deblocking_filter_idc : 0
slice_alpha_c0_offset_div2 : 0
slice_beta_offset_div2 : 0
slice_group_change_cycle : 0
=== Prediction Weight Table ===
luma_log2_weight_denom : 0
chroma_log2_weight_denom : 0
luma_weight_l0_flag : 0
chroma_weight_l0_flag : 0
luma_weight_l1_flag : 0
chroma_weight_l1_flag : 0
=== Ref Pic List Reordering ===
ref_pic_list_reordering_flag_l0 : 0
ref_pic_list_reordering_flag_l1 : 0
=== Decoded Ref Pic Marking ===
no_output_of_prior_pics_flag : 0
long_term_reference_flag : 0
adaptive_ref_pic_marking_mode_flag : 0
Both I slices show the frame_num = 0. The next 2 (not shown) have frame_num = 1.
What kind of packetization does this H.264 stream use? For example, with FU-A/FU-B fragmentation (https://www.rfc-editor.org/rfc/rfc3984#page-11) you can always tell where a NAL unit ends, since it coincides with the end of the fragment that is marked as the last fragment of the current NALU.
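As a minimal sketch of the fragment check mentioned above, the snippet below reads the two header bytes that RFC 3984 defines for fragmentation units: the FU indicator (F, NRI, type 28 for FU-A or 29 for FU-B) and the FU header (start bit, end bit, reserved bit, original NAL unit type). It assumes the input is the RTP payload with the RTP header already stripped. The function name parse_fu and the returned dictionary keys are hypothetical.

```python
FU_A = 28  # NAL unit type values for fragmentation units (RFC 3984)
FU_B = 29


def parse_fu(payload):
    """Return the fragmentation flags of an RTP payload,
    or None if the payload is not a fragmentation unit."""
    if len(payload) < 2:
        return None
    indicator, header = payload[0], payload[1]
    if indicator & 0x1F not in (FU_A, FU_B):
        return None  # single NAL unit or aggregation packet
    return {
        "start": bool(header & 0x80),          # S bit: first fragment
        "end": bool(header & 0x40),            # E bit: last fragment
        "nal_unit_type": header & 0x1F,        # type of the fragmented NAL
        "nal_ref_idc": (indicator >> 5) & 0x03,
    }


# Last fragment of an IDR slice (type 5) with nal_ref_idc = 3.
print(parse_fu(bytes([0x7C, 0x45])))
```

This only marks the end of a NAL unit. With two slices per picture, as in the dump above, the end of a NAL unit is not yet the end of the picture, so a check like this would need to be combined with slice header fields such as first_mb_in_slice and frame_num.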

sort vs sortrows in Matlab

Is it possible to achieve the same functionality with the sort function as with sortrows? My matrix has over 4 million rows and sortrows introduces a lot of latency because of its iterations. (A vectorized approach would be appreciated.)
%Col1 -> date, Col2 -> id, Col3 -> ranking within each date-group (to help you debug)
data = [ ...
734614 5 3; 734615 6 5; 734622 1 1; 734615 1 1; 734615 4 3;
734622 2 2; 734622 4 3; 734615 3 2; 734615 5 4; 734614 3 2;
734614 1 1; 734622 8 4; 734622 9 5;] ;
sortedanswer =
734614 1 1
734614 3 2
734614 5 3
734615 1 1
734615 3 2
734615 4 3
734615 5 4
734615 6 5
734622 1 1
734622 2 2
734622 4 3
734622 8 4
734622 9 5
Thanks!
You could do it as
[~,indx]=sort(data(:,1));
sortedanswer=data(indx,:)
sortedanswer =
734614 5 3
734614 3 2
734614 1 1
734615 6 5
734615 1 1
734615 4 3
734615 3 2
734615 5 4
734622 1 1
734622 2 2
734622 4 3
734622 8 4
734622 9 5
Note that this sorts the rows by the first column only. Because sort is stable, rows with the same date keep the relative order they had in the original data, which is why you see 5 3 in the second and third columns of the first row in my output.
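Since a stable sort keeps ties in their existing order, one possible way to get the full sortrows ordering from plain sorts is to sort by the least significant column first and then by the date. The sketch below illustrates the idea in Python with the data from the question; it is not the answer's MATLAB code, and the variable names are hypothetical.

```python
data = [
    (734614, 5, 3), (734615, 6, 5), (734622, 1, 1), (734615, 1, 1),
    (734615, 4, 3), (734622, 2, 2), (734622, 4, 3), (734615, 3, 2),
    (734615, 5, 4), (734614, 3, 2), (734614, 1, 1), (734622, 8, 4),
    (734622, 9, 5),
]

# Pass 1: order by the secondary key (id, column 2).
by_id = sorted(data, key=lambda row: row[1])
# Pass 2: stable sort by the primary key (date, column 1); rows with equal
# dates keep the id order established in pass 1.
by_date_then_id = sorted(by_id, key=lambda row: row[0])

for row in by_date_then_id:
    print(*row)
```

The result matches the sortedanswer listed in the question. In MATLAB the same two-pass idea would apply sort to column 2 first and then to column 1 of the reordered matrix, each time reusing the returned index vector.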

Resources