Magento has a compilation mode in which all the files of a Magento installation are compiled into a single include path to increase performance. See http://alanstorm.com/magento_compiler_path and http://www.magentocommerce.com/wiki/modules_reference/english/mage_compiler/process/index
In my current shop setup, I have already configured APC as an opcode cache and am leveraging its performance gains: http://www.aitoc.com/en/blog/apc_speeds_up_Magento.html
My questions are:
1) Is there any advantage to using APC over Magento's compilation mode, or vice versa? I have a dedicated server for Magento and am looking for maximum performance gains.
2) Would it be useful to use both of these together? Why, or why not?
These do different things, so using both together is fine. APC will usually give a greater performance gain than simply enabling compilation, but doing both gives you the best of both worlds.
Just remember that once you have enabled compilation, you need to disable it before making any code changes or updating/installing modules, then recompile afterwards.
As @JohnBoy has already said in his answer, both can be used in conjunction.
Beyond that, my other concern was whether using APC would make compilation redundant.
So I verified the scenario with some siege load tests, and overall there is a definite improvement.
Here are the test results
siege --concurrent=50 --internet --file=urls.txt --verbose --benchmark --reps=30 --log=compilation.log
| Compilation | Date & Time         | Trans | Elap Time (s) | Data Trans (MB) | Resp Time (s) | Trans Rate (trans/s) | Throughput (MB/s) | Concurrent | OKAY | Failed |
| ----------- | ------------------- | ----- | ------------- | --------------- | ------------- | -------------------- | ----------------- | ---------- | ---- | ------ |
| No          | 2013-09-26 12:27:23 |   600 |        202.37 |               6 |          9.79 |                 2.96 |              0.03 |      29.01 |  600 |      0 |
| Yes         | 2013-09-26 12:34:05 |   600 |        199.78 |               6 |          9.73 |                 3.00 |              0.03 |      29.24 |  600 |      0 |
| No          | 2013-09-26 12:59:42 |  1496 |        510.40 |              17 |          9.97 |                 2.93 |              0.03 |      29.23 | 1496 |      4 |
| Yes         | 2013-09-26 12:46:05 |  1500 |        491.98 |              17 |          9.59 |                 3.05 |              0.03 |      29.24 | 1500 |      0 |
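To put the results in perspective, the relative gain in transaction rate from the two longer runs can be computed directly (numbers copied from the siege output above; this is just arithmetic, not a new benchmark):

```python
# Transaction rates (trans/s) from the ~1500-rep siege runs above
rate_without_compilation = 2.93
rate_with_compilation = 3.05

# Relative improvement from enabling compilation on top of APC
improvement = (rate_with_compilation - rate_without_compilation) / rate_without_compilation
print(f"{improvement:.1%}")  # prints 4.1%
```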
There was a certain amount of variance; however, the good thing was that there was always some improvement, however minuscule it might be.
So we can use both.
The only extra overhead here is disabling and recompiling after module changes.
When backing up a database in CockroachDB, the numbers of rows, index entries, and bytes written are included in the output:
job_id | status | fraction_completed | rows | index_entries | bytes
---------------------+-----------+--------------------+---------+---------------+--------------
903515943941406094 | succeeded | 1 | 100000 | 200000 | 4194304
When restoring the same backup, the same metrics are reported:
job_id | status | fraction_completed | rows | index_entries | bytes
---------------------+-----------+--------------------+---------+---------------+--------------
803515943941406094 | succeeded | 1 | 99999 | 199999 | 4194200
What causes the difference between the two and is all my data restored?
A couple of things affect the metrics you are observing. The backup phase may copy certain system tables for metadata; these are not directly restored from the backup image. The configuration from those system tables is applied to your restored database, but it does not count toward the reported rows / index_entries metrics.
So a relatively small delta between the two values is not unusual in full backup/restore scenarios.
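To see how small the delta in the question actually is, you can compute the relative difference per metric (a quick sketch using the values reported above):

```python
# Metrics as reported by BACKUP and RESTORE in the question
backup = {"rows": 100_000, "index_entries": 200_000, "bytes": 4_194_304}
restore = {"rows": 99_999, "index_entries": 199_999, "bytes": 4_194_200}

# Relative difference per metric; values this small are consistent with
# a handful of system/metadata rows rather than missing user data
for metric in backup:
    delta = (backup[metric] - restore[metric]) / backup[metric]
    print(f"{metric}: {delta:.6f}")
```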
I do research in the field of Health-PM and am facing unstructured big data that needs a preprocessing phase to be converted into a suitable event log.
From googling, I understand that no ProM plug-in, stand-alone program, or script has been developed specifically for this task, except Celonis, which claims to have developed an event log converter. I'm also writing an event log generator for my specific case study.
I just want to know: is there any business solution, case study, or article that has investigated this issue?
What exactly do you mean by unstructured? Is it a badly structured table like the example you provided, or data that is not structured at all (e.g. a hard disk full of files)?
In the first situation, Celonis indeed provides an option to extract events from tables using Vertica SQL. In their free SNAP environment you can learn how to do that.
In the latter, I guess that at least semi-structured data is needed to extract events at scale; otherwise your script has no clue where to look.
Good question! Many process mining papers mention that most existing information systems are PAISs (process-aware information systems) and hence qualify for process mining. This is true, BUT it does not mean you can get the data out of the box!
What's the solution? You may transform the existing data (typically from a relational database of your business solution, e.g., an ERP or HIS system) into an event log that process mining can understand.
It works like this: you look into the table containing, e.g., patient registration data. You need the patient ID of this table and the timestamp of registration for each ID. You create an empty table for your event log, typically called "Activity_Table". You consider giving a name to each activity depending on the business context. In our example "Patient Registration" would be a sound name. You insert all the patient IDs with their respective timestamp into the Activity_Table followed by the same activity name for all rows, i.e., "Patient Registration". The result looks like this:
|Patient-ID | Activity | timestamp |
|:----------|:--------------------:| -------------------:|
| 111 |"Patient Registration"| 2021.06.01 14:33:49 |
| 112 |"Patient Registration"| 2021.06.18 10:03:21 |
| 113 |"Patient Registration"| 2021.07.01 01:20:00 |
| ... | | |
Congrats! You have an event log with one activity. The rest works just the same: you create the same table for every important action that has a timestamp in your database, e.g., "Diagnose finished", "lab test requested", "treatment A finished".
|Patient-ID | Activity | timestamp |
|:----------|:-----------------:| -------------------:|
| 111 |"Diagnose finished"| 2021.06.21 18:03:19 |
| 112 |"Diagnose finished"| 2021.07.02 01:22:00 |
| 113 |"Diagnose finished"| 2021.07.01 01:20:00 |
| ... | | |
Then you UNION all these mini tables and sort it based on Patient-ID and then by timestamp:
|Patient-ID | Activity | timestamp |
|:----------|:--------------------:| -------------------:|
| 111 |"Patient Registration"| 2021.06.01 14:33:49 |
| 111 |"Diagnose finished" | 2021.06.21 18:03:19 |
| 112 |"Patient Registration"| 2021.06.18 10:03:21 |
| 112 |"Diagnose finished" | 2021.07.02 01:22:00 |
| 113 |"Patient Registration"| 2021.07.01 01:20:00 |
| 113 |"Diagnose finished" | 2021.07.01 01:20:00 |
| ... | | |
If you notice, the last two rows have the same timestamp. This is very common when working with real data. To handle it, we need an extra column, called "Order" here, which helps the process mining algorithm understand the "normal" order of activities sharing a timestamp according to the nature of the underlying business. In this case, we know that registration happens before diagnosis, so we assign a low value (e.g., 1) to all "Patient Registration" activities. The table might look like this:
|Patient-ID | Activity | timestamp |Order |
|:----------|:--------------------:|:-------------------:| ----:|
| 111 |"Patient Registration"| 2021.06.01 14:33:49 | 1 |
| 111 |"Diagnose finished" | 2021.06.21 18:03:19 | 2 |
| 112 |"Patient Registration"| 2021.06.18 10:03:21 | 1 |
| 112 |"Diagnose finished" | 2021.07.02 01:22:00 | 2 |
| 113 |"Patient Registration"| 2021.07.01 01:20:00 | 1 |
| 113 |"Diagnose finished" | 2021.07.01 01:20:00 | 2 |
| ... | | | |
Now, you have an event log that process mining algorithms understand!
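The steps above can be sketched in a few lines of pandas. This is a minimal illustration, not production code; the table names (registrations, diagnoses) are hypothetical and mirror the examples above:

```python
import pandas as pd

# Hypothetical mini activity tables, one per action, as extracted
# from the source database (e.g. an HIS system)
registrations = pd.DataFrame({
    "Patient-ID": [111, 112, 113],
    "timestamp": pd.to_datetime([
        "2021-06-01 14:33:49", "2021-06-18 10:03:21", "2021-07-01 01:20:00"]),
}).assign(Activity="Patient Registration", Order=1)

diagnoses = pd.DataFrame({
    "Patient-ID": [111, 112, 113],
    "timestamp": pd.to_datetime([
        "2021-06-21 18:03:19", "2021-07-02 01:22:00", "2021-07-01 01:20:00"]),
}).assign(Activity="Diagnose finished", Order=2)

# UNION the mini tables, then sort by case id, timestamp, and the
# tie-breaking Order column for activities sharing a timestamp
event_log = (pd.concat([registrations, diagnoses], ignore_index=True)
             .sort_values(["Patient-ID", "timestamp", "Order"])
             .reset_index(drop=True))
print(event_log[["Patient-ID", "Activity", "timestamp"]])
```

Note that for patient 113, where both activities carry the same timestamp, the Order column keeps "Patient Registration" first.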
Side note:
There have been many attempts to automate the event log extraction process. The works of "Eduardo González López de Murillas" are really interesting if you want to follow this topic. I can also recommend this open-access paper by Eduardo et al. (2018):
"Connecting databases with process mining: a meta model and toolset" (https://link.springer.com/article/10.1007/s10270-018-0664-7)
I've read plenty of blogs that talk about 25-50% cost savings from moving a microservices fleet from plain EC2 VMs to containers on either ECS or EKS. While that's compelling, I'm scratching my head over how that can be, given cost estimates from some simple models with the AWS Pricing Calculator. I'm sure I'm oversimplifying the problem with my estimates below, but the price difference is nearly a factor of five ($68 vs. $319), which raises the question: where are the cost savings?
For instance, assume a small cluster of eight services that work well on a [small t4g][2]:
| Instance | EC2 Type | vCPU | Mem (GB) | Storage (GB) | Monthly Cost |
| ---------- | --------- | ---- | -------- | ------------ | ------------:|
| Service 1 | t4g.small | 2 | 2 | 8 | USD 8.47 |
| Service 2 | t4g.small | 2 | 2 | 8 | USD 8.47 |
| Service 3 | t4g.small | 2 | 2 | 8 | USD 8.47 |
| Service 4 | t4g.small | 2 | 2 | 8 | USD 8.47 |
| Service 5 | t4g.small | 2 | 2 | 8 | USD 8.47 |
| Service 6 | t4g.small | 2 | 2 | 8 | USD 8.47 |
| Service 7 | t4g.small | 2 | 2 | 8 | USD 8.47 |
| Service 8 | t4g.small | 2 | 2 | 8 | USD 8.47 |
| **Totals** | | 16 | 16 | 64 | USD 67.76 |
If I were to move to ECS/EKS and purchase some larger c5's with equivalent vCPU, this is what I'm guessing I'd need to accomplish the same thing:
| Instance | EC2 Type | vCPU | Mem (GB) | Storage (GB) | Monthly Cost |
| ---------- | ---------- | ---- | -------- | ------------ | ------------:|
| Service 1 | c5.2xlarge | 8 | 16 | 32 | USD 159.42 |
| Service 2 | | | | | |
| Service 3 | | | | | |
| Service 4 | | | | | |
| Service 5 | c5.2xlarge | 8 | 16 | 32 | USD 159.42 |
| Service 6 | | | | | |
| Service 7 | | | | | |
| Service 8 | | | | | |
| **Totals** | | 16 | [32][1] | 64 | USD 318.84 |
As I mentioned, I'm sure this is a naive comparison, but I figured I'd end up in the same ballpark, not off by a factor of 5. I understand that ECS/EKS will give me better resource utilization, but it would need to increase efficiency by 470% just to break even, which seems unreasonable.
[1]: While the c5's have twice the memory, given the 1:10 ratio of mem:vCPU, I don't believe this is contributing significantly to the delta.
[2]: Assuming 1 Year Reservation, EC2 Instance Savings Plan, No Upfront
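For reference, the factor-of-five gap follows directly from the two tables (numbers copied from the estimates above):

```python
# Monthly costs from the two estimates above
t4g_fleet = 8 * 8.47    # eight t4g.small, USD 67.76
c5_fleet = 2 * 159.42   # two c5.2xlarge, USD 318.84

ratio = c5_fleet / t4g_fleet
print(f"c5 fleet costs {ratio:.2f}x the t4g fleet")
```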
The comparison is not valid because they are different products; as Vikrant said, it is comparing apples to oranges.
t4g is a burstable CPU instance

- Suitable for websites with spiky traffic
- Aimed at a relatively small number of users (with spikes in visitor numbers)

Before t4g came out, there were t3a, t3, t2, t1... each newer generation offers better performance at a lower price. t4g is also based on Graviton processors, not the Intel Xeon that C5 uses. Additionally, you are factoring in reserved-instance pricing.

When the CPU credits run out for a t4g instance, T instances show why they are so affordable: the CPU is throttled to a crawl (for example, to 10% on micro instances).
C5 is for high and constant CPU load

- Its pricing is not as competitive as newer products released a few months ago
- It does not offer burstable CPU performance
- It provides high, constant raw CPU power
- It focuses on CPU, with relatively less RAM
- It offers better network bandwidth

C5 is suitable for applications with a constantly heavy CPU load. A web server is usually not CPU-demanding, and its workload spikes as traffic patterns change. Unless a website requires very fast response times involving CPU-heavy calculations, T-family instances are more suitable for web servers.

Of course, if the website serves a large number of people across multiple time zones, the workload will be higher and more stable; in that situation C5 may be a better choice. You can run the CPU at 100% all the time if you need to; there are no CPU credits involved and it will not slow down, providing constant, high CPU performance.
Using C5 with T4g
A tactical setup for a high-traffic web server is to use C5 for a solid performance baseline and T instances to handle extra traffic during busy hours. For example, a food ordering platform can use C5 to handle baseline customer orders and T instances to cover the peaks around lunch and dinner.
This way, when traffic drops, the T instances slowly regain CPU credits. You also won't have to worry about servers slowing to a crawl (10% speed) when credits run out, because the fast C5 instance backs you up even if all the T instances are throttled.
How much faster is tensorflow-gpu with AVX and AVX2 compared with without them?
I tried to find an answer using Google, with no success. It's hard to recompile tensorflow-gpu for Windows, so I want to know whether it's worth it.
If your computation is one giant matmul on CPU, you will get a 3x speed-up on Xeon V3 (see the benchmark here). But it's also possible to see no speed-up, presumably because not enough time is spent in high-arithmetic-intensity ops executed on the CPU.
Here's a table from the "High Performance Models" guide for training resnet50 on CPU with different optimizations. It looks like you can get a 2.5x speed-up with the best settings.
| Optimization | Data Format | Images/Sec (step time) | Intra threads | Inter Threads |
| ------------ | ----------- | ---------------------- | ------------- | ------------- |
| AVX2         | NHWC        | 6.8 (147ms)            | 4             | 0             |
| MKL          | NCHW        | 6.6 (151ms)            | 4             | 1             |
| MKL          | NHWC        | 5.95 (168ms)           | 4             | 1             |
| AVX          | NHWC        | 4.7 (211ms)            | 4             | 0             |
| SSE3         | NHWC        | 2.7 (370ms)            | 4             | 0             |
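Reading the table as relative speed-ups over the SSE3 baseline (just arithmetic on the Images/Sec column above):

```python
# Images/sec for resnet50 CPU training, copied from the table above
images_per_sec = {"AVX2": 6.8, "MKL (NCHW)": 6.6, "MKL (NHWC)": 5.95,
                  "AVX": 4.7, "SSE3": 2.7}

baseline = images_per_sec["SSE3"]
for build, rate in images_per_sec.items():
    print(f"{build}: {rate / baseline:.2f}x over SSE3")
```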
If you are able to compile an optimized version for Windows, it would help to mention it in this issue -- https://github.com/yaroslavvb/tensorflow-community-wheels/issues/13 -- since there seems to be some demand for such a build.
If I take a snapshot of a persistent disk and then try to get information about the snapshot in gcutil, the data is always incomplete. I need to see this data since snapshots are differential:
server$ gcutil getsnapshot snapshot-3
+----------------------+-----------------------------------+
| name | snapshot-3 |
| description | |
| creation-time | 2014-07-30T06:52:56.223-07:00 |
| status | READY |
| disk-size-gb | 200 |
| storage-bytes | |
| storage-bytes-status | |
| source-disk | us-central1-a/disks/app-db-1-data |
+----------------------+-----------------------------------+
Is there a way to determine how much storage this snapshot actually occupies? gcutil and the web UI are the only resources I know of, and neither displays this information.
Unfortunately this is a known bug; the Google developers are aware of it and are working on a fix.