I have a panel data set where each user is identified by a unique 16-digit hexadecimal userid. When I import this into Stata the userid turns red as Stata does not recognize this format. How can I convert this hexadecimal into a unique numeric identifier that Stata can recognize so that I can do further analysis with the panel?
You should really provide a minimal working example.
I guess you are looking for something like the following example:
clear
input str3 oldid var1 var2
"abc" 1 20
"abc" 2 21
"abc" 4 25
"def" 5 23
"def" 5 21
"hij" 6 27
end
egen newid = group(oldid)
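For readers who do not know egen's group() function: it assigns the integers 1, 2, 3, ... to the distinct values of oldid, in sorted order. A rough sketch of the same idea in Ruby (this is not Stata code, and the method name group_ids is made up for illustration):

```ruby
# Map each distinct string id to a consecutive integer (1, 2, 3, ...),
# numbering the distinct ids in sorted order.
# `group_ids` is a hypothetical helper name used only for this sketch.
def group_ids(ids)
  mapping = ids.uniq.sort.each_with_index.map { |id, i| [id, i + 1] }.to_h
  ids.map { |id| mapping[id] }
end

p group_ids(%w[abc abc abc def def hij])
# prints [1, 1, 1, 2, 2, 3]
```

The original string id stays in the data, so the numeric id is only a stand-in for panel commands that need a numeric variable.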
I want to find the position of a specific number in a given string of numbers. For example, if my input is:
1 4 5 7 9 12
Then for 4 the answer should be 1. My code is as follows:
secarr = second.split(" ")
answer = secarr.index(number) #here number is a variable which gets the character
puts answer
The above method works if I write "4" (or any other specific character) instead of number, but it does not work if I pass a variable. Is there a method in Ruby to do the same?
This is probably because your variable number is an Integer, while secarr is an Array of Strings. Try converting the number to a string:
answer = secarr.index(number.to_s)
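A minimal sketch of the whole lookup, assuming the input is a space-separated line (the method name position_of is made up for illustration). Calling to_s on the argument means it works whether the caller passes an Integer or a String:

```ruby
# Return the zero-based position of `number` in a space-separated line,
# or nil if it is not present. `position_of` is a hypothetical helper name.
def position_of(line, number)
  tokens = line.split(" ")      # always an Array of Strings
  tokens.index(number.to_s)     # compare String with String
end

puts position_of("1 4 5 7 9 12", 4)      # prints 1
puts position_of("1 4 5 7 9 12", "12")   # prints 5
p ["1", "4"].index(4)                    # prints nil (Integer never equals String)
```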
Forgive me, I am very new to Ruby and relatively new to programming in general. My problem is probably not hard, but I have googled until my fingers bled looking for a solution and I just can't get it.
I have a line of text that looks like this:
6 19 11 28 22 localhost G6UI ip0 cameraLink cameraLinkMissingScans 15116
After all is said and done, I want it to look like this:
6.19.2014,11.28.22,localhost,G6UI,ip0,cameraLink cameraLinkMissingScans 15116
I have accomplished this in Bash (I am essentially just making a CSV file, with the time and date formatted the way I want), but, for reasons too lengthy to explain, I'd like to do it with Ruby.
I have a start, although it's probably a bit sad:
myLineOfText.sub!(/[^-a-zA-Z0-9]/,'\1.\2.')
Which gives me this:
6..19 11 28 22 localhost G6UI ip0 cameraLink cameraLinkMissingScans 15116
Any help would be greatly appreciated, I just need something to get me started.
Thanks in advance.
If you can be sure the format always remains the same, you can do:
str.sub!(/(\d+) (\d+) (\d+)/,'\1.\2.\3').gsub!(/ /,',')
Example:
str='6 19 11 28 22 localhost G6UI ip0 cameraLink cameraLinkMissingScans 15116'
str.sub!(/(\d+) (\d+) (\d+)/,'\1.\2.\3').gsub!(/ /,',')
puts str
# prints: 6.19.11,28,22,localhost,G6UI,ip0,cameraLink,cameraLinkMissingScans,15116
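The one-liner above joins the first three numbers and turns every remaining space into a comma, which is close to, but not quite, the layout asked for in the question. If you want to stay with a regex, here is a minimal sketch that captures each field explicitly. It assumes the line always starts with five numbers followed by at least three words, and that the year is the literal "2014" (it does not appear in the input). The method name to_csv_line is made up for illustration:

```ruby
# Reformat "m d h m s host a b rest..." into "m.d.YEAR,h.m.s,host,a,b,rest...".
# `to_csv_line` and the default year are assumptions for this sketch.
def to_csv_line(line, year = "2014")
  pattern = /\A(\d+) (\d+) (\d+) (\d+) (\d+) (\S+) (\S+) (\S+) (.*)\z/
  # chomp drops a trailing newline so that \z can match;
  # a line that does not fit the pattern is returned unchanged.
  line.chomp.sub(pattern) do
    "#{$1}.#{$2}.#{year},#{$3}.#{$4}.#{$5},#{$6},#{$7},#{$8},#{$9}"
  end
end

puts to_csv_line("6 19 11 28 22 localhost G6UI ip0 cameraLink cameraLinkMissingScans 15116")
# prints 6.19.2014,11.28.22,localhost,G6UI,ip0,cameraLink cameraLinkMissingScans 15116
```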
With questions such as this one, the answer depends on what is fixed and what is variable in the data's format. I have assumed:
there are at least nine substrings separated with spaces
substrings 0 and 1 (base 0) correspond to the month and day, and are to be combined with the literal "2014" to form a date of the form mm.dd.2014
substrings 2-4 are to be joined with '.' and followed with ','
substrings 5-7 are to be joined with ',' and followed with ','
the remainder of the substrings are to be joined with a space
I don't think a regex is the right tool for the formatting; rather, just split the string on spaces and form the new string with a series of Array#join calls, combining the resulting substrings in the obvious way:
s = "6 19 11 28 22 localhost G6UI ip0 cameraLink cameraLinkMissingScans 15116"
a = s.split(' ')
#=> ["6", "19", "11", "28", "22", "localhost", "G6UI", "ip0", "cameraLink",
# "cameraLinkMissingScans", "15116"]
a[0]+'.'+a[1]+'.2014,'+a[2..4].join('.')+','+a[5..7].join(',')+','+
a[8..-1].join(' ')
#=> "6.19.2014,11.28.22,localhost,G6UI,ip0,cameraLink cameraLinkMissingScans 15116"
I have researched this delimiter issue for a while and pulled useful code from here and there, but I can't quite put it together.
I'm trying to parse a string by word in SSIS and I need help with a VB script component.
I need to split my column data on the following delimiters:
"AND","OR","**", ","
I have a table like this
ID Description
1 apple AND orange, tangerine
2 avocado OR guacamole AND pineapple OR fruit
3 watermelon ** melon
And I want to parse the data like this
ID Description
1 apple
1 orange
1 tangerine
2 avocado
2 guacamole
2 pineapple
2 fruit
3 watermelon
3 melon
Thank you.
To split the string into words, a combination of Replace and Split is enough:
(I assume that you know how to take ID)
split(
replace(
replace (
replace( Description, "AND", ","),
"OR", ","
),
"**", ","
), ","
)
This returns an array of elements, as you asked for:
id = 2
a=my_previous_functions_combination("avocado OR guacamole AND pineapple OR fruit")
for each fruit in a
do something with id and fruit
next
So much for the VB part. I don't know what you want to do in SSIS: a calculated member? A named set? Extending a fact table? See the second part of this answer:
Second part:
To convert one row into multiple rows you need a script. You can find a good example in the post "SSIS - Script Component, Split single row to multiple rows".
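To make the replace-and-split idea and the one-row-to-many-rows expansion concrete outside SSIS, here is a minimal sketch in Ruby. It is not a script component. The names DELIMITERS and explode_row are made up for illustration, and I am assuming that AND and OR are always surrounded by whitespace, so that words which merely contain those letters are left alone. Unlike a plain replace and split, it also trims the spaces around each item:

```ruby
# Delimiters: the words AND / OR (surrounded by whitespace), "**", and ",".
DELIMITERS = /\s+AND\s+|\s+OR\s+|\*\*|,/

# Turn one (id, description) row into one [id, item] pair per item.
# `explode_row` is a hypothetical helper name for this sketch.
def explode_row(id, description)
  description.split(DELIMITERS)
             .map(&:strip)        # drop the spaces left around each item
             .reject(&:empty?)    # skip blanks produced by adjacent delimiters
             .map { |item| [id, item] }
end

p explode_row(1, "apple AND orange, tangerine")
# prints [[1, "apple"], [1, "orange"], [1, "tangerine"]]
p explode_row(3, "watermelon ** melon")
# prints [[3, "watermelon"], [3, "melon"]]
```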
I am computing the difference between two columns in a file like this:
cat "trace-0-dir2.txt" | awk '{print expr $2-$1}' | sort
This gives me values like:
-1.28339e+09
-1.28339e+09
-1.28339e+09
-1.28339e+09
I want to avoid the rounding and get the exact value. How can this be achieved?
FYI, trace-0-dir2.txt contains:
1283453524.342134 65337.141749 10 2
1283453524.556784 65337.388047 11 2
1283453524.556794 65337.411165 12 2
1283453524.556806 65337.435947 13 2
1283453524.556811 65337.435989 14 2
1283453524.556816 65337.453931 15 2
1283453524.771522 65337.484866 16 2
The printf function can help you get the formatting you need. You don't need expr and you don't need cat: awk can do any calculation, and you can invoke awk directly on the file.
You can alter the 20.20 to any number based on the format you are looking for.
[jaypal:~/Temp] cat file0
1283453524.342134 65337.141749 10 2
1283453524.556784 65337.388047 11 2
1283453524.556794 65337.411165 12 2
1283453524.556806 65337.435947 13 2
1283453524.556811 65337.435989 14 2
1283453524.556816 65337.453931 15 2
1283453524.771522 65337.484866 16 2
[jaypal:~/Temp] awk '{ printf("%20.20f\n", $2-$1)}' file0
-1283388187.20038509368896484375
-1283388187.16873693466186523438
-1283388187.14562892913818359375
-1283388187.12085914611816406250
-1283388187.12082219123840332031
-1283388187.10288500785827636719
-1283388187.28665614128112792969
From the man page:
Field Width:
An optional digit string specifying a field width; if the output string has fewer characters than the field width it will be blank-padded on the left (or right, if the left-adjustment indicator has been given) to make up the field width (note that a leading zero is a flag, but an embedded zero is part of a field width);
Precision:
An optional period, `.', followed by an optional digit string giving a precision which specifies the number of digits to appear after the decimal point, for e and f formats, or the maximum number of characters to be printed from a string; if the digit string is missing, the precision is treated as zero;
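Note that the digits beyond the sixth decimal place in the output above come from binary floating-point representation, not from the data, which has only six decimal places. If you need the exact decimal difference, one option is decimal arithmetic on the original strings. Here is a minimal sketch in Ruby (not awk) using the BigDecimal library that ships with Ruby. The method name exact_difference is made up for illustration:

```ruby
require "bigdecimal"

# Subtract the first column from the second ($2 - $1 in awk terms) using
# decimal arithmetic, so no binary rounding noise is introduced.
# `exact_difference` is a hypothetical helper name for this sketch.
def exact_difference(first, second)
  (BigDecimal(second) - BigDecimal(first)).to_s("F")   # "F" = plain decimal notation
end

puts exact_difference("1283453524.342134", "65337.141749")
# prints -1283388187.200385
```

Passing the fields as strings matters here: converting them to floats first would bring the rounding noise back.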
Hello, I want to do something like this.
I have 4 rows with unique ids 1, 2, 3, 4. All four rows contain some string like
option1,option2,option3,option4
Now I want to add "a ) " to option1, "b ) " to option2, and so on. Is there a way I can do this with a query? Currently I am adding these to a lot of rows manually.
It's not clear exactly by what logic you want to select the letter to prepend to field somestring, but if for example it's a "Caesar's cypher" (1 gives 'a', 2 gives 'b' etc) based on the id field, as your question suggests, then this should work:
UPDATE sometable
SET somestring = (
substr('abcdefghijklmnopqrstuvwxyz', id, 1) ||
' ) ' || somestring)
WHERE id <= 26;
...for no more than 26 rows of course, since beyond that the logic must change and obviously we can't guess just how you want to extend it (use id modulo 26 + 1, use more characters than just lowercase letters, or ...?) since you give no clue on why you want to do this.
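To illustrate the mapping itself, here is a minimal sketch in Ruby of the id-to-letter logic, including one possible extension past 26 rows. The wrap-around, where id 27 gives 'a' again, is purely my assumption, and the names LETTERS, label_for and prefixed are made up for illustration:

```ruby
LETTERS = ("a".."z").to_a

# 1 -> "a", 2 -> "b", ..., 26 -> "z". Wrapping back to "a" at 27 is an
# assumption; the question does not say what should happen there.
def label_for(id)
  LETTERS[(id - 1) % 26]
end

# Build the prefixed string, e.g. "a ) option1".
def prefixed(id, text)
  "#{label_for(id)} ) #{text}"
end

puts prefixed(1, "option1")   # prints a ) option1
puts prefixed(4, "option4")   # prints d ) option4
```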