Let's say I'm storing some ordered strings like this:
1 apple
2 banana
3 pear
4 mango
5 cantaloupe
Now I need to insert strawberry so that it shows up at position 4.
Of course, I can easily do that by updating the numeric index, e.g.:
1 apple
2 banana
3 pear
4 strawberry
5 mango
6 cantaloupe
But the issue is that if I need to store this position update in the database, I now need to store 3 operations:
a) UPDATE index = 6 WHERE index = 5
b) UPDATE index = 5 WHERE index = 4
c) insert strawberry at position 4
This is fine for small lists, but in large lists I would end up with a large number of position update operations.
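To make the cost described above concrete, here is a minimal Python sketch of the renumbering approach. The function name `insert_with_renumber` and the in-memory dict standing in for the database table are assumptions made for illustration only; the point is just to count the operations that one insert generates.

```python
def insert_with_renumber(rows, position, value):
    """Insert `value` at `position` in `rows` (a dict: index -> string).

    Returns the list of operations that would have to be stored.
    Hypothetical helper: a plain dict stands in for the database table.
    """
    ops = []
    # Shift every row at or after `position` up by one, highest index first,
    # so that no index is overwritten before it has been moved.
    for idx in sorted((i for i in rows if i >= position), reverse=True):
        rows[idx + 1] = rows.pop(idx)
        ops.append(("UPDATE", idx, idx + 1))
    rows[position] = value
    ops.append(("INSERT", position, value))
    return ops


rows = {1: "apple", 2: "banana", 3: "pear", 4: "mango", 5: "cantaloupe"}
for op in insert_with_renumber(rows, 4, "strawberry"):
    print(op)
```

For the five-item list this yields the three operations listed above. In general, inserting at position k into a list of n items yields n - k + 1 updates plus the insert, which is why the cost grows with the list size.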
Is there a more efficient approach? Maybe using something other than numbers?
I have some input data like this.
unique ID  Q1  Q2  Q3
1          1   1   2
2          1   1   2
3          1   0   3
4          2   0   1
5          3   1   2
6          4   1   3
My goal is to extract a subset of rows that satisfies the following conditions:
total count: 4
Q1=1 count: 2
Q1=2 count: 1
Q2=1 count: 1~3
Q3=1 count: 1
In this case, the data sets with ids [1, 2, 4, 5] and [2, 3, 4, 5] are both acceptable answers.
In reality, I may have 6000+ rows of data and up to 12 count constraints like the ones above. Each count may vary from 1 to 50.
I've written a solution that first groups all ids by each condition, then uses depth-first search to exhaustively try all possible combinations between the groups. (I believe this is a brute-force solution...)
However, I always run out of memory and time before I get a possible answer.
My questions are:
What is the lowest possible time complexity for this problem? (I believe this is a kind of subset sum problem, but I am not sure.)
How can I solve this problem without brute force? I'm considering dynamic programming or a decision tree. However, I believe I would probably run out of memory with either of these. Or can I solve this problem using each data row's probabilities/entropy (and I would appreciate more details on this)?
My brute-force code is not worth reading at all, so I'll skip posting it.
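To pin down what counts as an acceptable answer, here is a minimal Python sketch that only checks a candidate set of ids against the conditions above. It is not a solver. The names `ROWS`, `CONSTRAINTS`, `TOTAL` and `satisfies` are assumptions introduced for illustration, and the data is the six-row example from the question.

```python
# Example data from the question: id -> (Q1, Q2, Q3).
ROWS = {
    1: (1, 1, 2),
    2: (1, 1, 2),
    3: (1, 0, 3),
    4: (2, 0, 1),
    5: (3, 1, 2),
    6: (4, 1, 3),
}
COLUMNS = {"Q1": 0, "Q2": 1, "Q3": 2}

# (column, value, minimum count, maximum count)
CONSTRAINTS = [
    ("Q1", 1, 2, 2),
    ("Q1", 2, 1, 1),
    ("Q2", 1, 1, 3),
    ("Q3", 1, 1, 1),
]
TOTAL = 4


def satisfies(ids):
    """Return True if the chosen ids meet the total and every count constraint."""
    ids = set(ids)
    if len(ids) != TOTAL:
        return False
    for column, value, low, high in CONSTRAINTS:
        col = COLUMNS[column]
        count = sum(1 for i in ids if ROWS[i][col] == value)
        if not low <= count <= high:
            return False
    return True


print(satisfies([1, 2, 4, 5]))  # True
print(satisfies([2, 3, 4, 5]))  # True
print(satisfies([1, 2, 3, 4]))  # False: three rows have Q1=1
```

A checker like this can serve as the acceptance test for whatever search strategy is eventually used.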
I have 3 columns and 1000 rows of integers in a .dat file, and I have to plot them with a gnuplot script so that the first column is on the x-axis and sqrt(c2²+c3²) is on the y-axis, where c2 is the value from the second column and c3 is the value from the third column.
Normally I use something like plot "somefile.dat" using 1:2, but now I have to combine the second and third columns somehow, like using 1:sqrt(2²+3²).
To build an expression from column values in your data file, gnuplot lets you put the expression in parentheses inside the using specification. Within the parentheses, you refer to a column's value by prefixing the column number with a '$' (e.g. $2 refers to the value from column 2, $3 refers to the value from column 3, etc.). You can use those references as many times as needed, and each one is replaced by the value from that column for the current row.
In your case, to use the 1st column as the independent x-values and the result of your expression, computed from columns 2 and 3, as the dependent y-values, you can do:
plot "somefile.dat" using 1:(sqrt($2*$2+$3*$3))
A short example with the input file as:
$ cat somefile.dat
1 1 1
2 2 2
3 3 3
4 4 4
5 5 5
6 6 6
7 7 7
8 8 8
9 9 9
10 10 10
Creating a short plot file for convenience:
$ cat some.plt
plot "somefile.dat" using 1:(sqrt($2*$2+$3*$3))
You can generate your plot with
$ gnuplot -p some.plt
Look things over and let me know if this is what you needed.
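If you want to sanity-check the y-values that the plot expression above should produce, the same formula can be evaluated outside gnuplot. This is a minimal Python sketch; the function name `plot_points` is an assumption, and the rows are taken from the example somefile.dat above.

```python
import math


def plot_points(rows):
    """Map (c1, c2, c3) rows to (x, y) with y = sqrt(c2^2 + c3^2)."""
    return [(c1, math.hypot(c2, c3)) for c1, c2, c3 in rows]


# First three rows of the example somefile.dat.
sample = [(1, 1, 1), (2, 2, 2), (3, 3, 3)]
for x, y in plot_points(sample):
    print(x, round(y, 4))
```

For the example file every row has c2 = c3 = x, so the plotted points should lie on the line y = sqrt(2) * x.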
I need a way to find a pattern in a list of values. Specifically, every second I get a value in a range (e.g. 1-3), and I want to find the recurring pattern in this list of values.
If I plot these values in an x,y system I'd get something like Nyquist–Shannon sampling. It could be very interesting to work on this.
I could also plot these values and work on visual pattern recognition (neural networks...).
input:
instant value
1 1
2 2
3 3
4 1
5 2
6 3
7 1
output->1,2,3
What would be the best way to proceed?
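For reference, one simple reading of the example above is "find the shortest prefix that, repeated, reproduces the whole list". The Python sketch below implements only that reading, under the assumptions that the values repeat exactly (no noise) and that the recording starts at the beginning of a cycle. The function name `shortest_repeating_pattern` is hypothetical.

```python
def shortest_repeating_pattern(values):
    """Return the shortest prefix whose repetition reproduces `values`.

    Tries each candidate period in increasing order and accepts the first
    one for which every value equals the value one period earlier.
    """
    n = len(values)
    for period in range(1, n + 1):
        if all(values[i] == values[i - period] for i in range(period, n)):
            return list(values[:period])
    return list(values)  # only reached for an empty input


print(shortest_repeating_pattern([1, 2, 3, 1, 2, 3, 1]))  # [1, 2, 3]
```

This check is quadratic in the worst case and does not tolerate noisy or drifting data, so it is a baseline for exactly periodic input rather than a general pattern-recognition method.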
I was wondering, if you had a column like
[8 8 8 8 8 1 4 4 4 1 1]'
What code could I write to find the numbers that are not repeated consecutively (non-contiguous)? In this case, what code would I have to write to find row 6? This is for big data.
--Dwight
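To make the requirement concrete, here is a minimal Python sketch of one interpretation: report the 1-based rows whose value differs from both the row before and the row after. The function name `isolated_rows` is an assumption, and the column is written as a plain Python list rather than in the array notation used above.

```python
def isolated_rows(values):
    """Return 1-based positions whose value matches neither neighbour."""
    n = len(values)
    result = []
    for i, v in enumerate(values):
        same_as_prev = i > 0 and values[i - 1] == v
        same_as_next = i < n - 1 and values[i + 1] == v
        if not same_as_prev and not same_as_next:
            result.append(i + 1)  # convert to 1-based row number
    return result


column = [8, 8, 8, 8, 8, 1, 4, 4, 4, 1, 1]
print(isolated_rows(column))  # [6]
```

This is a single linear pass over the data. For very large columns the same neighbour comparison can be expressed with vectorised array operations in whatever environment holds the data.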
I am programming a card game and I need to sort a stack of cards by their rank so that they form a gapless sequence.
In this special game, the card with value 2 can be used as a wild card, so for example the cards
2 3 5
should be sorted like this:
3 2 5
because the 2 replaces the 4; otherwise it would not be a valid sequence.
However, the cards
2 3 4
should stay as they are.
Restriction: only one '2' can be used as a wildcard.
2 2 3 4
would also stay as it is, because the first 2 would replace the ACE (or 1, whatever you call it).
The following would not be a valid input sequence, since one of the 2s must be used as a wildcard and the other not, and it is then not possible to make up a gapless sequence:
2 4 2 6
Now I am having difficulty figuring out whether a 2 is used as a wildcard or not. Once I have that, I think I can do the rest of the sorting.
Thanks for any algorithmic help on this problem!
EDIT in response to your clarification to your new requirement:
You're implying that you'll never get data for which a gapless sequence cannot be formed. (If only I could have such guarantees in the real world.) So:
Do you have a 2?
  No: your sequence must already be gapless.
  Yes: you need to figure out where to put it.
    Sort your input. Do you see a gap? Since you can only use one 2 as a wildcard, there can be at most one gap.
      No: treat the 2 as a legitimate number two.
      Yes: move the 2 to the gap to fill it in.
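The steps above can be sketched roughly as follows in Python. This is a sketch under stated assumptions, not a definitive implementation: ranks are plain integers, the input is guaranteed to be arrangeable into a gapless sequence (as implied above), no rank other than 2 is duplicated, and the function name `arrange_with_wildcard` is hypothetical.

```python
def arrange_with_wildcard(cards):
    """Order cards so they read as a gapless sequence, using one 2 as a wildcard.

    Assumes valid input: at most one single-rank gap among the non-2 cards.
    """
    twos = [c for c in cards if c == 2]
    others = sorted(c for c in cards if c != 2)
    if not twos:
        return others  # no 2 available: the sequence must already be gapless
    for i in range(len(others) - 1):
        if others[i + 1] - others[i] == 2:
            # Exactly one rank is missing here: plug the gap with one 2.
            # Any remaining 2 stays at the front as an ordinary two.
            return twos[1:] + others[:i + 1] + [2] + others[i + 1:]
    # No gap: leave the 2s at the front of the sorted cards.
    return twos + others


print(arrange_with_wildcard([2, 3, 5]))     # [3, 2, 5]
print(arrange_with_wildcard([2, 3, 4]))     # [2, 3, 4]
print(arrange_with_wildcard([2, 2, 3, 4]))  # [2, 2, 3, 4]
```

Separating the 2s from the other cards before looking for the gap avoids mistaking the distance between a natural 2 and the next card for the gap the wildcard should fill. Invalid inputs such as 2 4 2 6 are not detected by this sketch.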
EDIT in response to your new requirement:
In this case, just look for the highest single gap, and plug it with a 2 if you have a 2 available.
Original answer:
Since your sequence must be gapless, you could count the number of 2s you have and the sizes of all the gaps that are present. Then just fill in the highest gap for which you have a sufficient number of 2s.