Which model should I use? xtlogit or xtprobit - panel

I have the following panel data set with very large N (500,000) and small T (15 years). My dependent variable is a dummy, either project_1 or project_2. I want to estimate the likelihood of getting a project as a function of treated, with year and village fixed effects. For a continuous dependent variable, I was using reghdfe.
The dependent variable is a dummy that equals 1 in the year a village gets the project and remains 1 in all subsequent years.
I am aware that I cannot use the "probit" command in Stata because I have a panel. Can you suggest which model I should use?
| village | population | year | project_1 | project_2 | treated |
| A | 100 | 2001 | 0 | 0 | 0 |
| A | 100 | 2002 | 1 | 0 | 0 |
| A | 100 | 2003 | 1 | 0 | 1 |
| A | 100 | 2004 | 1 | 0 | 1 |
| A | 100 | 2005 | 1 | 0 | 1 |
| B | 200 | 2001 | 0 | 0 | 0 |
| B | 200 | 2002 | 0 | 0 | 1 |
| B | 200 | 2003 | 0 | 1 | 1 |
| B | 200 | 2004 | 0 | 1 | 1 |
| B | 200 | 2005 | 0 | 1 | 1 |
| C | 150 | 2001 | 0 | 0 | 0 |
| C | 150 | 2002 | 0 | 0 | 0 |
| C | 150 | 2003 | 0 | 0 | 0 |
| C | 150 | 2004 | 1 | 0 | 0 |
| C | 150 | 2005 | 1 | 0 | 1 |
| D | 175 | 2001 | 0 | 0 | 0 |
| D | 175 | 2002 | 0 | 0 | 0 |
| D | 175 | 2003 | 0 | 0 | 0 |
| D | 175 | 2004 | 0 | 0 | 1 |
| D | 175 | 2005 | 0 | 0 | 1 |

Your question has two parts: whether logit or probit is more appropriate for you, and how to implement the chosen model in Stata. As @NickCox mentioned, the former is better suited to Cross Validated, where it has already been discussed at length: Difference between logit and probit models


Recording particular cell of a table on the basis of a certain condition using grep

I have a number of tables that look as follows:
time | node | left |LP iter|LP it/n|mem/heur|mdpt |vars |cons |rows |cuts |sepa|confs|strbr| dualbound | primalbound | gap | compl.
0.0s| 1 | 0 | 100 | - | 1046k | 0 | 100 | 102 | 100 | 0 | 0 | 0 | 0 | -- | 9.999990e+05*| Inf | unknown
* 0.3s| 1 | 0 | 100 | - | LP | 0 | 200 | 102 | 100 | 0 | 0 | 0 | 0 | -- | 5.587300e+04 | Inf | unknown
12.0s| 1 | 0 | 239 | - | 1781k | 0 | 239 | 102 | 100 | 0 | 0 | 0 | 0 | 5.577800e+04 | 5.587300e+04 | 0.17%| unknown
12.1s| 1 | 0 | 287 | - | 2595k | 0 | 239 | 102 | 935 | 835 | 1 | 0 | 0 | 5.577800e+04 | 5.587300e+04 | 0.17%| unknown
66.8s| 1 | 0 | 422 | - | 3061k | 0 | 336 | 102 | 935 | 835 | 1 | 0 | 0 | 5.577800e+04 | 5.587300e+04 | 0.17%| unknown
89.4s| 1 | 0 | 481 | - | 3218k | 0 | 361 | 102 | 935 | 835 | 1 | 0 | 0 | 5.580100e+04 | 5.587300e+04 | 0.13%| unknown
89.5s| 1 | 0 | 579 | - | 3513k | 0 | 361 | 102 |1335 |1235 | 2 | 0 | 0 | 5.580100e+04 | 5.587300e+04 | 0.13%| unknown
100s| 1 | 0 | 715 | - | 3837k | 0 | 403 | 102 |1335 |1235 | 2 | 0 | 0 | 5.583250e+04 | 5.587300e+04 | 0.07%| unknown
I'm interested in recording the first numeric value in the gap column (the second-to-last column of the table). The gap column can contain either Inf or x.xx% values. If every value in the gap column is Inf, I would simply record Inf; otherwise, I would like to record the first numeric value. For example, in the table above, the value I would like to record is 0.17. I have tried many different approaches without success. Any guidance on how to achieve this would be much appreciated. Thanks!
You may use this awk solution:
awk -F '[[:blank:]]*\\|[[:blank:]]*' '
NR > 1 && (!v || v == "Inf") {
    v = ($(NF-1) == "Inf" ? $(NF-1) : $(NF-1)+0)
}
END { print v }' file
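For comparison, the same extraction can be sketched in Python. This is only a minimal sketch, assuming each table is available as a list of text lines, that the first line is the header, and that the gap is the second-to-last pipe-separated cell. The function name `first_gap` and the shortened sample rows are made up for illustration.

```python
def first_gap(lines):
    """Return the first numeric gap as a float, "Inf" if every gap is Inf,
    or None if no data row was found."""
    result = None
    for line in lines[1:]:  # assumption: line 0 is the header row
        cells = [cell.strip() for cell in line.split("|")]
        if len(cells) < 2:
            continue  # not a table row
        gap = cells[-2]  # second-to-last column
        if gap.endswith("%"):
            return float(gap.rstrip("%"))  # first numeric value wins
        if gap == "Inf":
            result = "Inf"
    return result


# Shortened rows in the same layout as the table in the question.
sample = [
    "time | node | dualbound | primalbound | gap | compl.",
    "0.0s| 1 | -- | 9.999990e+05*| Inf | unknown",
    "* 0.3s| 1 | -- | 5.587300e+04 | Inf | unknown",
    "12.0s| 1 | 5.577800e+04 | 5.587300e+04 | 0.17%| unknown",
    "89.4s| 1 | 5.580100e+04 | 5.587300e+04 | 0.13%| unknown",
]
print(first_gap(sample))
```

Unlike the awk version, this returns as soon as a percentage is seen, so a gap of 0.00% would also count as the first numeric value.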

How to compare two circuits based on their utilization

I have some hardware IPs that I need to synthesize. The IP contains several generic parameters I can play with, and each combination of parameters gives me a different utilization report after synthesis and implementation.
For example, for two different configurations, Design_1 and Design_2, I get the following in Vivado 2018.1. The third row is the ratio of the values of Design_2 divided by the values of Design_1.
As you can see in this simple example, Design_2 uses fewer Slice LUTs but slightly more F7 Muxes.
My question is: how do I draw a conclusion about the cost of each design? Should I prioritize Slice LUTs, Registers, or something else?
| Design_1 | 34124 | 16913 | 1453 | 91 | 10272 | 31538 | 2586 | 9020 | 37 | 11 | 125 | 0 | 1 | 1 | 4 | 2 | 1 | 0 | 4 | 2 | 16 | 16 | 46 | 10 | 0 | 2 | 2 | 0 | 2 | 0 | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Design_2 | 34097 | 16913 | 1550 | 91 | 10189 | 31511 | 2586 | 9021 | 37 | 11 | 125 | 0 | 1 | 1 | 4 | 2 | 1 | 0 | 4 | 2 | 16 | 16 | 46 | 10 | 0 | 2 | 2 | 0 | 2 | 0 | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| -------- | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
| (2)/(1) | 0.999208768022506 | 1 | 1.06675843083276 | 1 | 0.991919781931464 | 0.999143889910584 | 1 | 1.00011086474501 | 1 | 1 | 1 | #DIV/0! | 1 | 1 | 1 | 1 | 1 | #DIV/0! | 1 | 1 | 1 | 1 | 1 | 1 | #DIV/0! | 1 | 1 | #DIV/0! | 1 | #DIV/0! | 1 | #DIV/0! | #DIV/0! | #DIV/0! | #DIV/0! | #DIV/0! | #DIV/0! | #DIV/0! | #DIV/0! |
It depends on your needs. LUTs and F7 Muxes are different physical cells in your FPGA, so even if you don't use them, they will still be there.
If one resource is more critical than the other, you should try to minimize the utilization of the critical resource to simplify place and route.
If nothing is critical, I think it is better to use the F7 Muxes first, because Slice LUTs are more flexible for the rest of your design.
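To see at a glance which resources actually differ between the two reports, the ratio row can be recomputed while skipping columns that are identical and guarding against the division by zero that produces `#DIV/0!`. The sketch below is only illustrative: it uses the first twelve values of each row from the table in the question, refers to columns by position because the column headers are not shown, and the function names are hypothetical.

```python
def ratio(base, other):
    """Return other/base, or None when the baseline is zero."""
    return None if base == 0 else other / base


def changed_columns(design_1, design_2):
    """Map 1-based column position -> ratio for columns whose values differ."""
    return {
        position: ratio(a, b)
        for position, (a, b) in enumerate(zip(design_1, design_2), start=1)
        if a != b
    }


# First twelve columns of each row, copied from the table in the question.
design_1 = [34124, 16913, 1453, 91, 10272, 31538, 2586, 9020, 37, 11, 125, 0]
design_2 = [34097, 16913, 1550, 91, 10189, 31511, 2586, 9021, 37, 11, 125, 0]

for position, value in changed_columns(design_1, design_2).items():
    print(f"column {position}: {value:.4f}")
```

Filtering this way leaves only the handful of resources that matter for the comparison, which can then be weighed against whichever resource is critical in the rest of the design.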

Pivot Table in Hive and Create Multiple Columns for Unique Combinations

I want to pivot the following table
| ID | Code | date | qty |
| 1 | A | 1/1/19 | 11 |
| 1 | A | 2/1/19 | 12 |
| 2 | B | 1/1/19 | 13 |
| 2 | B | 2/1/19 | 14 |
| 3 | C | 1/1/19 | 15 |
| 3 | C | 3/1/19 | 16 |
The expected output is:
| ID | Code | mth_1(1/1/19) | mth_2(2/1/19) | mth_3(3/1/19) |
| 1 | A | 11 | 12 | 0 |
| 2 | B | 13 | 14 | 0 |
| 3 | C | 15 | 0 | 16 |
I am new to Hive and am not sure how to implement this.
NOTE: I don't want to use a fixed mapping, because my month values change over time.
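To make the target transformation concrete, here is a minimal sketch of the pivot logic in plain Python. It is not Hive syntax and is not offered as the Hive solution; it only illustrates deriving the month columns from the data instead of a fixed mapping. It assumes the dates are month/day/year strings, and the function name `pivot` is hypothetical.

```python
from datetime import datetime

# Rows copied from the input table in the question: (ID, Code, date, qty).
rows = [
    (1, "A", "1/1/19", 11),
    (1, "A", "2/1/19", 12),
    (2, "B", "1/1/19", 13),
    (2, "B", "2/1/19", 14),
    (3, "C", "1/1/19", 15),
    (3, "C", "3/1/19", 16),
]


def pivot(rows):
    """Return (dates, table): dates in chronological order and, for each
    (ID, Code) pair, one quantity per date, with 0 where no row exists."""
    # Columns come from the data itself, so new months need no code change.
    dates = sorted(
        {date for _, _, date, _ in rows},
        key=lambda d: datetime.strptime(d, "%m/%d/%y"),  # assumed date format
    )
    table = {}
    for id_, code, date, qty in rows:
        cells = table.setdefault((id_, code), [0] * len(dates))
        cells[dates.index(date)] += qty
    return dates, table


dates, table = pivot(rows)
print(["ID", "Code"] + [f"mth_{i}({d})" for i, d in enumerate(dates, start=1)])
for (id_, code), cells in table.items():
    print([id_, code] + cells)
```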

mysql table unaligned in console output when using UTF8

I like to use mysql client. But when using UTF-8, the tables on the console are unaligned:
> set names utf8;
> [some query]
| RuleId | TaxonId | Note | NoteSci | MinCount | DayFrom | MonthFrom | DayTo | MonthTo | ExtraNote |
| 722 | 10090 | sedmihlásek malý | Hippolais caligata | 1 | 1 | 1 | 31 | 12 | NULL |
| 727 | 10059 | Anseranas semipalmata | husovec strakatý | 1 | 1 | 1 | 31 | 12 | NULL |
| 728 | 10062 | Cygnus atratus | labuť černá | 1 | 1 | 1 | 31 | 12 | NULL |
| 729 | 10094 | Anser cygnoides | husa labutí | 1 | 1 | 1 | 31 | 12 | NULL |
| 730 | 10063 | Tadorna cana | husice šedohlavá | 1 | 1 | 1 | 31 | 12 | NULL |
| 731 | 10031 | Cairina moschata f. domestica | pižmovka domácí | 20 | 1 | 1 | 31 | 12 | NULL |
| 732 | 10088 | Cairina scutulata | pižmovka bělokřídlá | 1 | 1 | 1 | 31 | 12 | NULL |
| 733 | 10087 | Anas sibilatrix | hvízdák chilský | 1 | 1 | 1 | 31 | 12 | NULL |
| 734 | 10077 | Anas platyrhynchos f. domestica | kachna domácí | 1000 | 1 | 1 | 31 | 12 | NULL |
| 735 | 10086 | Anas hottentota | čírka hottentotská | 1 | 1 | 1 | 31 | 12 | NULL |
This is apparently because the mysql client computes the column widths from the byte length of each string, which does not account for multi-byte UTF-8 characters. As a result, exactly one space is missing for each accented character (because each of these takes two bytes).
Do you know of a possible workaround for this problem?
Run your mysql client with the charset option:
mysql -uUSER -p DATABASE --default-character-set=utf8
(Replace USER and DATABASE with your actual user name and database name.)
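The one-space-per-accented-character effect described in the question can be reproduced with a short Python sketch. It only illustrates the difference between byte length and character count under UTF-8; it is not the mysql client's actual width computation, and the function name `padding_shortfall` is hypothetical.

```python
def padding_shortfall(text, encoding="utf-8"):
    """Number of padding spaces lost if a column is padded by byte length
    instead of by character count."""
    return len(text.encode(encoding)) - len(text)


# "labuť černá" from the table, written with escapes so the code points
# are unambiguous: three accented letters, two bytes each in UTF-8.
czech_name = "labu\u0165 \u010dern\u00e1"
latin_name = "Cygnus atratus"

print(len(czech_name), len(czech_name.encode("utf-8")))
print(padding_shortfall(czech_name), padding_shortfall(latin_name))
```

A pure-ASCII value such as the Latin name loses no padding, which matches the observation that only rows with accented characters drift out of alignment.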

Ruby on Rails: Rake: rake stats didn't add my field to the correct value?

Before my rake stats modification
| Name | Lines | LOC | Classes | Methods | M/C | LOC/M |
| Controllers | 5037 | 3936 | 31 | 292 | 9 | 11 |
| Helpers | 150 | 128 | 0 | 17 | 0 | 5 |
| Models | 1523 | 1166 | 42 | 123 | 2 | 7 |
| Libraries | 633 | 415 | 4 | 65 | 16 | 4 |
| Functional tests | 289 | 228 | 13 | 0 | 0 | 0 |
| Unit tests | 560 | 389 | 30 | 0 | 0 | 0 |
| Model specs | 1085 | 904 | 0 | 3 | 0 | 299 |
| View specs | 88 | 75 | 0 | 0 | 0 | 0 |
| Controller specs | 468 | 388 | 0 | 2 | 0 | 192 |
| Total | 9833 | 7629 | 120 | 502 | 4 | 13 |
Code LOC: 5645 Test LOC: 1984 Code to Test Ratio: 1:0.4
Now, when I add:
::STATS_DIRECTORIES << %w(Factories\ specs test/factories) if File.exist?('test/factories')
::CodeStatistics::TEST_TYPES << "Factory specs" if File.exist?('test/factories')
around line 120, it should increase test LOC, right?
| Controllers | 5037 | 3936 | 31 | 292 | 9 | 11 |
| Helpers | 150 | 128 | 0 | 17 | 0 | 5 |
| Models | 1523 | 1166 | 42 | 123 | 2 | 7 |
| Libraries | 633 | 415 | 4 | 65 | 16 | 4 |
| Functional tests | 289 | 228 | 13 | 0 | 0 | 0 |
| Unit tests | 560 | 389 | 30 | 0 | 0 | 0 |
| Model specs | 1085 | 904 | 0 | 3 | 0 | 299 |
| View specs | 88 | 75 | 0 | 0 | 0 | 0 |
| Controller specs | 468 | 388 | 0 | 2 | 0 | 192 |
| Factories specs | 144 | 119 | 0 | 0 | 0 | 0 |
| Total | 9977 | 7748 | 120 | 502 | 4 | 13 |
Code LOC: 5764 Test LOC: 1984 Code to Test Ratio: 1:0.3
Instead of adding the 144 lines from factories to Test LOC, it adds them to Code LOC =\
How do I get the line count to show up in Test LOC?
You're adding something called "Factories specs" (plural) to the STATS_DIRECTORIES array, but you call it "Factory specs" (singular) when you add it to the TEST_TYPES array. So when rake stats hits your test/factories folder, it looks for "Factories specs" in TEST_TYPES, doesn't find it, and assumes it's code, not tests. You need to use the same name in both places:
::STATS_DIRECTORIES << %w(Factory\ specs test/factories) if File.exist?('test/factories')
::CodeStatistics::TEST_TYPES << "Factory specs" if File.exist?('test/factories')
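The effect of the name mismatch can be reproduced with a small Python model of the lookup. This is a simplified sketch and not the Rails source: it assumes only that a directory's LOC counts as test code when its label appears in the list of test types. The LOC figures are taken from the tables in the question, and the function name `split_loc` is hypothetical.

```python
def split_loc(loc_by_label, test_types):
    """Return (code_loc, test_loc), classifying each label by membership
    in test_types."""
    code = sum(loc for label, loc in loc_by_label.items() if label not in test_types)
    tests = sum(loc for label, loc in loc_by_label.items() if label in test_types)
    return code, tests


# LOC column from the rake stats output in the question.
base_loc = {
    "Controllers": 3936, "Helpers": 128, "Models": 1166, "Libraries": 415,
    "Functional tests": 228, "Unit tests": 389, "Model specs": 904,
    "View specs": 75, "Controller specs": 388,
}
test_types = ["Functional tests", "Unit tests", "Model specs",
              "View specs", "Controller specs", "Factory specs"]

# Mismatched label: the directory is registered as "Factories specs".
mismatched = split_loc({**base_loc, "Factories specs": 119}, test_types)
# Matching label: the directory is registered as "Factory specs".
matched = split_loc({**base_loc, "Factory specs": 119}, test_types)

print(mismatched)
print(matched)
```

With the mismatched label the 119 LOC land on the code side, which reproduces the Code LOC of 5764 and Test LOC of 1984 reported in the question.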
