I want to delete a whole column at once. With thousands of rows, deleting the values one by one is time-consuming, so is there a better way to do it?
My next question: after deleting the column, I need to add a new column with a default value.
For example:
id | x | y | z
| A | B | C
| A | B | C
| A | B | C
| A | B | C
If the above is the table, with thousands of rows, I want to delete "z" and add "newColumn" with a default value of "D", as below:
id | x | y | newColumn
| A | B | D
| A | B | D
| A | B | D
| A | B | D
I have a drug analysis experiment that needs to generate a value based on a given drug database and a set of 1000 random experiments.
The original database looks like this, where the numbers in the columns represent the rank for each drug. This is a simplified version; the actual database has more drugs and more genes.
+-------+-------+-------+
| Genes | DrugA | DrugB |
+-------+-------+-------+
| A | 1 | 3 |
| B | 2 | 1 |
| C | 4 | 5 |
| D | 5 | 4 |
| E | 3 | 2 |
+-------+-------+-------+
A score is calculated based on user's input: A and C, using the following formula:
# Compute Function
# ['A','C'] as array input
computeFunction(array) {
# do some stuff with the array ...
}
The same formula is used for any provided values.
For the randomness test, each experiment requires the algorithm to provide randomized values for A and C, so both A and C can take any value from 1 to 5.
Now I have two methods of selecting values to generate the 1000 sets for the P-value calculation, but I need someone to point out whether one is better than the other, or whether there is a way to compare the two methods.
Method 1
Generate 1000 randomized databases based on the database shown above, so that each table contains a different set of value pairs.
Example of one database out of the 1000 randomized databases:
+-------+-------+-------+
| Genes | DrugA | DrugB |
+-------+-------+-------+
| A | 2 | 3 |
| B | 4 | 4 |
| C | 3 | 2 |
| D | 1 | 5 |
| E | 5 | 1 |
+-------+-------+-------+
Next we perform computeFunction() with the new A and C values.
Method 2
Pick random genes from the original database and use their values as the new randomized values for the input genes.
For example, we pick the values of E and B as the new values for A and C.
In the original database (DrugA column), E is 3 and B is 2.
So now A is 3 and C is 2. Next we perform computeFunction() with the new A and C values.
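One way to compare the two methods empirically is to run both and look at the resulting score distributions side by side. The following is only a rough sketch under several assumptions: `compute_score` is a placeholder (a plain sum) because the real formula is not shown, only the DrugA column is used, Method 2 is assumed to pick distinct genes, and the seed 42 and all method names are made up for illustration.

```ruby
# Original ranks for one drug (DrugA column from the table above).
DRUG_A = { "A" => 1, "B" => 2, "C" => 4, "D" => 5, "E" => 3 }.freeze
INPUT_GENES = %w[A C].freeze
TRIALS = 1000

# Placeholder for computeFunction: the real formula is not given in the post,
# so this simply sums the ranks. Swap in the actual formula.
def compute_score(ranks)
  ranks.sum
end

# Method 1: shuffle the whole rank column, then read off the input genes.
def method1_scores(table, genes, trials, rng)
  Array.new(trials) do
    shuffled = table.keys.zip(table.values.shuffle(random: rng)).to_h
    compute_score(genes.map { |g| shuffled[g] })
  end
end

# Method 2: pick random distinct genes (an assumption) and borrow their ranks.
def method2_scores(table, genes, trials, rng)
  Array.new(trials) do
    picked = table.keys.sample(genes.size, random: rng)
    compute_score(picked.map { |g| table[g] })
  end
end

rng = Random.new(42) # fixed seed so the run is repeatable
m1 = method1_scores(DRUG_A, INPUT_GENES, TRIALS, rng)
m2 = method2_scores(DRUG_A, INPUT_GENES, TRIALS, rng)

# Print how often each score occurs under each method.
puts "score  method1  method2"
(m1 | m2).sort.each do |score|
  puts format("%5d  %7d  %7d", score, m1.count(score), m2.count(score))
end
```

For a single drug column and distinct picks, both methods amount to drawing a random ordered pair of distinct ranks, so the two histograms should look alike up to sampling noise. If the score uses several drug columns at once, the methods may differ: the Method 1 example shuffles each column independently, whereas Method 2 keeps each gene's ranks together.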
Summary
Since both methods produce completely randomized input, it seems to me that they will produce similar sets of 1000 outcomes. Is there any way I could prove they are similar?
Given this ridiculously simple data set:
+--------+-----+
| Bucket | Foo |
+--------+-----+
| 1 | A |
| 1 | B |
| 1 | C |
| 1 | D |
+--------+-----+
I want to see the value of Foo in the previous row:
select
foo,
max(foo) over (partition by bucket order by foo rows between 1 preceding and 1 preceding) as prev_foo
from
...
Which gives me:
+--------+-----+----------+
| Bucket | Foo | Prev_Foo |
+--------+-----+----------+
| 1 | A | A |
| 1 | B | A |
| 1 | C | B |
| 1 | D | C |
+--------+-----+----------+
Why do I get 'A' back for the first row? I would expect it to be null. It's throwing off calculations where I'm looking for that null. I can work around it by throwing a row_number() in there, but I'd prefer to handle it with fewer calcs.
Use the LAG function to get the value from the previous row; by default it returns NULL when there is no previous row:
LAG(foo) OVER(partition by bucket order by foo) as Prev_Foo
I have created a matrix A from dataset1 and I want to use this later in dataset2.
How can I programmatically save this matrix and import it to dataset2?
Consider the following toy datasets:
/* create dataset 1 */
clear
set obs 5
forvalues i = 1 / 5 {
generate norm`i' = rnormal(10, 20)
}
list
+----------------------------------------------------------+
| norm1 norm2 norm3 norm4 norm5 |
|----------------------------------------------------------|
1. | 29.184 47.57735 -6.06845 47.43953 12.10697 |
2. | 9.9639 65.09492 31.92023 18.47133 39.01292 |
3. | 20.88154 -2.251937 1.185946 22.67908 -11.98451 |
4. | 10.03257 13.94616 -10.22853 18.34467 37.34412 |
5. | 17.15362 42.20448 30.38455 -.5586708 20.34926 |
+----------------------------------------------------------+
save data1, replace
/* create dataset 2 */
clear
set obs 5
forvalues i = 1 / 5 {
generate unif`i' = runiform()
}
list
+------------------------------------------------------+
| unif1 unif2 unif3 unif4 unif5 |
|------------------------------------------------------|
1. | .4398566 .222692 .359981 .8840723 .840627 |
2. | .8955406 .7279246 .7385288 .1269085 .2610574 |
3. | .6760237 .5028067 .9236897 .2413106 .8938763 |
4. | .9666038 .0491344 .0098985 .4427792 .8565752 |
5. | .4118744 .368421 .1528643 .8636661 .0944128 |
+------------------------------------------------------+
save data2, replace
One can do this using the mkmat and svmat commands:
use data1, clear
mkmat norm*, matrix(A)
use data2, clear
matrix list A
A[5,5]
norm1 norm2 norm3 norm4 norm5
r1 29.184 47.577354 -6.0684505 47.439529 12.106971
r2 9.9638996 65.094917 31.920233 18.471329 39.01292
r3 20.88154 -2.2519367 1.1859455 22.679077 -11.984506
r4 10.032575 13.946158 -10.228531 18.344669 37.344124
r5 17.153618 42.204475 30.384546 -.55867082 20.349257
svmat A, names(norm)
list
+-----------------------------------------------------------------------------------------------------------------+
| unif1 unif2 unif3 unif4 unif5 norm1 norm2 norm3 norm4 norm5 |
|-----------------------------------------------------------------------------------------------------------------|
1. | .4398566 .222692 .359981 .8840723 .840627 29.184 47.57735 -6.06845 47.43953 12.10697 |
2. | .8955406 .7279246 .7385288 .1269085 .2610574 9.9639 65.09492 31.92023 18.47133 39.01292 |
3. | .6760237 .5028067 .9236897 .2413106 .8938763 20.88154 -2.251937 1.185946 22.67908 -11.98451 |
4. | .9666038 .0491344 .0098985 .4427792 .8565752 10.03257 13.94616 -10.22853 18.34467 37.34412 |
5. | .4118744 .368421 .1528643 .8636661 .0944128 17.15362 42.20448 30.38455 -.5586708 20.34926 |
+-----------------------------------------------------------------------------------------------------------------+
Note that this solution works only if neither clear matrix nor clear all has been invoked in between, since either one drops the matrix from memory.
Does anyone know the rules for valid Ruby variable names? Can they be matched using a regex?
UPDATE: This is what I could come up with so far:
^[_a-z][a-zA-Z0-9_]+$
Does this seem right?
Identifiers are pretty straightforward. They begin with a letter or an underscore, and contain letters, underscores and digits. Local variables can't begin with an uppercase letter (Ruby treats such a name as a constant), so you could just use a regex like this:
/^[a-z_][a-zA-Z_0-9]*$/
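As a quick sanity check, here is a minimal sketch that tries the pattern on a few names. The helper name `local_variable_name?` and the sample strings are made up for illustration. It uses \A and \z rather than ^ and $, because in Ruby ^ and $ anchor to line boundaries, so a multi-line string could otherwise slip through.

```ruby
# Hypothetical helper: ASCII-only check for a local variable name.
# \A and \z anchor to the whole string; ^ and $ would only anchor to a line.
LOCAL_VARIABLE_NAME = /\A[a-z_][a-zA-Z_0-9]*\z/

def local_variable_name?(name)
  LOCAL_VARIABLE_NAME.match?(name)
end

samples = ["foo", "_bar", "foo_2", "Foo", "2foo", "foo-bar", "foo\nbar!"]
samples.each do |name|
  puts "#{name.inspect}: #{local_variable_name?(name)}"
end
```

Note that the pattern also accepts reserved words such as end or class, which are not usable as variable names, so a real validator would need to exclude those separately.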
Variable names can also contain non-ASCII Unicode characters, in which case most of the existing regexes don't match.
varname = "\u2211" # => "∑"
eval(varname + '= "Tony the Pony"') # => "Tony the Pony"
puts varname # => ∑
local_variable_identifier = /Insert large regular expression here/
varname =~ local_variable_identifier # => nil
See also "Fun with Unicode" in either the Ruby 1.9 Pickaxe or at Fun with Unicode.
According to http://rubylearning.com/satishtalim/ruby_names.html a Ruby variable consists of:
A name is an uppercase letter, lowercase letter, or an underscore ("_"), followed by Name characters (this is any combination of upper- and lowercase letters, underscore and digits).
In addition, global variables begin with a dollar sign, instance variables with a single at-sign, and class variables with two at-signs.
A regular expression to match all that would be:
%r{
(\$|@{1,2})? # optional leading punctuation
[A-Za-z_] # one uppercase letter, lowercase letter, or underscore
[A-Za-z0-9_]* # optional characters (including digits)
}x
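A small sketch of how such a pattern might be used, assuming the leading punctuation is $ for globals and one or two at-signs for instance and class variables, as described above. The \A and \z anchors and the constant name `VARIABLE_NAME` are my additions, not part of the quoted rule.

```ruby
# Anchored variant of the pattern; the constant name is illustrative.
VARIABLE_NAME = %r{
  \A
  (\$|@{1,2})?      # optional sigil: global, instance or class variable
  [A-Za-z_]         # first character
  [A-Za-z0-9_]*     # remaining characters
  \z
}x

%w[foo $stdout @name @@count @@@bad 9lives].each do |name|
  puts "#{name}: #{VARIABLE_NAME.match?(name)}"
end
```

Because the quoted rule allows an uppercase first letter, this also accepts names like Foo, which Ruby treats as constants rather than variables.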
Hope that helps.
I like @aboutruby's answer, but just to complete it, here's the equivalent using POSIX bracket expressions.
/^[_[:lower:]][_[:alnum:]]*$/
Or, since a-z is actually shorter than [:lower:]:
/^[_a-z][_[:alnum:]]*$/
I think /^(\$){0,1}[_a-zA-Z][a-zA-Z0-9_]*([?!]){0,1}$/ is a bit closer to what you will need...
It depends on whether you want to match method names as well.
If you are trying to match a name that might be encountered in an expression, then it might start with $ and it might end with ? or !. If you know for sure that it is just a local variable then the rule will be much simpler.
I was trying to figure one out for a Rails patch, and Matthew Draper wrote this one, using the Ruby parser as a reference:
/\A(?![A-Z0-9])(?:[[:alnum:]_]|[^\0-\177])+\z/
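To see how that pattern behaves, here is a short sketch that runs it over a few strings. The constant name and the samples are my own, and the non-ASCII samples are written as escapes so the snippet does not depend on the source file encoding.

```ruby
# Pattern quoted above; the constant name is illustrative.
IDENTIFIER = /\A(?![A-Z0-9])(?:[[:alnum:]_]|[^\0-\177])+\z/

samples = ["foo", "_tmp", "\u2211", "na\u00efve", "Foo", "1abc", "a-b", ""]
samples.each do |name|
  puts "#{name.inspect}: #{IDENTIFIER.match?(name)}"
end
```

The negative lookahead rejects names that start with an ASCII uppercase letter or a digit, while the second alternative lets any non-ASCII character through, which is what allows a name like the summation sign.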
And here it is, straight from the horse's mouth. (The horse in this case is the Draft ISO Ruby Specification):
local-variable-identifier → ( lowercase-character | _ ) identifier-character *
identifier-character → lowercase-character | uppercase-character | decimal-digit | _
uppercase-character → A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
lowercase-character → a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r | s | t | u | v | w | x | y | z
decimal-digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
In Ruby 1.9, using named groups, you can translate this literally:
local_variable_identifier = %r{
(?<uppercase_character> A | B | C | D | E | F | G | H | I | J | K | L | M
| N | O | P | Q | R | S | T | U | V | W | X | Y | Z
){0}
(?<lowercase_character> a | b | c | d | e | f | g | h | i | j | k | l | m
| n | o | p | q | r | s | t | u | v | w | x | y | z
){0}
(?<decimal_digit> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9){0}
(?<identifier_character> \g<lowercase_character>
| \g<uppercase_character>
| \g<decimal_digit>
| _
){0}
( \g<lowercase_character> | _ ) \g<identifier_character>*
}x
Of course, this is not how you would really write it.