PyTorch Lightning - Is Trainer necessary to use multiple GPUs? - pytorch-lightning

If I want to take advantage of PyTorch Lightning's ability to train using multiple GPUs, do I have to use their Trainer?

If you want to use all the Lightning features (including multi-GPU training), such as loggers, metrics tracking, and checkpointing, then you need to use the Trainer. On the other hand, if you are fine with somewhat limited functionality, you can check out the recent LightningLite.
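For reference, a minimal multi-GPU Trainer sketch (argument names follow recent PyTorch Lightning releases; older versions used gpus=2 instead, and MyLightningModule is a hypothetical LightningModule standing in for your own):

import pytorch_lightning as pl

model = MyLightningModule()  # hypothetical module defining your model and training step

trainer = pl.Trainer(
    accelerator="gpu",  # run on GPUs
    devices=2,          # number of GPUs to train across
    strategy="ddp",     # distributed data parallel
)
trainer.fit(model)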

Related

How do I set up esrally to use with elassandra and my own tests?

I'm wondering whether others have attempted benchmarking Elassandra (more specifically, I'm using express-cassandra) using esrally. I'm hoping not to spend too much more time on esrally if it's not a good solution for testing Elassandra.
Reading the documentation, it looks like Rally is capable of starting from scratch: downloading Elasticsearch, building it from source, running it, connecting, creating a full schema, then testing with data that fills the schema (possibly random data), running queries, ...
I already have everything in place, and the only things I really want to see are:
Which of 10 different memory setups is fastest.
Which types of searches work well, and whether options 1, 2, and 3 from my existing software create drastic slowdowns or not...
Whether inserting while doing searches has an effect on the speed of my searches.
I'm not going to change many parameters other than memory (-Xmx, -Xms, maybe some others... like cached rows in a separate heap). For sure, I want to run all the tests with the latest Elassandra and not consider rebuilding or anything of the sort.
From reading the documentation, there is no mention of Elassandra. I found a total of TWO PAGES on Google about testing with esrally and Elassandra, and that did not boost my confidence that it's doable...
I would imagine that I have to use the benchmark-only pipeline. That at least removes all the gathering of the source, building, etc. I guess it also reduces the number of parameters I get in the resulting benchmark, but I don't need all the details...
Have you had any experience with such a setup? (Elassandra + esrally)
Yes, esrally works with Elassandra when you use the benchmark-only pipeline.
To automate the creation of Elassandra clusters to benchmark, you could use either ecm or the Kubernetes Helm chart.
For instance, using ecm:
ecm create bench_cluster -v 6.2.3.10 -n 3 -s -e
esrally --pipeline=benchmark-only --target-hosts=127.0.0.1:9200,127.0.0.2:9200,127.0.0.3:9200
ecm remove bench_cluster
For testing specific scenarios, you can write custom tracks.
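A custom track is just a JSON file plus any referenced data files. As a rough sketch (index, corpus, and file names here are placeholders), a track covering one of the scenarios you list, searching while bulk-indexing, could look roughly like this:

{
  "version": 2,
  "description": "Hypothetical track: search while indexing",
  "indices": [{ "name": "my-index", "body": "index.json" }],
  "corpora": [
    {
      "name": "my-corpus",
      "documents": [{ "source-file": "documents.json", "document-count": 100000 }]
    }
  ],
  "schedule": [
    {
      "parallel": {
        "tasks": [
          { "operation": { "operation-type": "bulk", "bulk-size": 5000 }, "clients": 4 },
          { "operation": { "operation-type": "search", "body": { "query": { "match_all": {} } } }, "clients": 2, "iterations": 1000 }
        ]
      }
    }
  ]
}

You would then run it with esrally --pipeline=benchmark-only --track-path=/path/to/that/track (check the Rally docs for the exact schema supported by your Rally version).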

OpenModelica FMU for Model Exchange - Parallelized?

Is the FMU for Model Exchange code (generated by OpenModelica) parallelized?
If I want parallelization, must I run different FMUs for Co-Simulation in parallel?
Thanks
It is a bit unclear what you want:
Do you want automatic parallelization of the simulation code inside the FMU?
We have some support for this using OpenMP, and also some experimental support using TBB.
Do you want to run the FMUs in parallel on different threads or different processes?
We are working on a tool for FMI co-simulation called OMSimulator, which will include some parallelization at the master-algorithm level.
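In the meantime, if you just want to run several co-simulation FMUs in parallel processes yourself, a minimal sketch using the third-party FMPy library (not part of OpenModelica; the .fmu file names are hypothetical) might look like:

from multiprocessing import Pool

from fmpy import simulate_fmu

def run(fmu_path):
    # Each worker process simulates one FMU independently.
    return simulate_fmu(fmu_path, stop_time=10.0)

if __name__ == "__main__":
    fmus = ["model_a.fmu", "model_b.fmu", "model_c.fmu"]  # hypothetical FMUs
    with Pool(processes=len(fmus)) as pool:
        results = pool.map(run, fmus)

Note this only works when the FMUs are independent; coupled FMUs need a master algorithm, which is what OMSimulator provides.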

How do you typically make your CPLEX Studio models more user friendly?

Non-optimization question about CPLEX Studio....
So you make your awesome OPL model in CPLEX Studio and it brilliantly solves your amazeballs problem.
Suppose you wanted to allow other users to access this model in a nice, user-friendly way: basically, specify some simple parameters in a simple user interface (without having to edit code, etc.), then output the solution in some arbitrary format you coded up, like an Excel file, HTML report, or whatever.
1) What are the options for a user interface, without adding in too much other technology?
(eg. I currently have a Java program doing exactly this, but I'd rather not rely on Java code / programmers / compiling / hosting source code etc)
2) What are the options for triggering some user friendly output, eg. in a standard format like Excel, some HTML report you coded up, or maybe just triggering a Python script, etc?
(eg. I currently render them in a Java FX application on grids, charts and HTML windows, I would prefer something more lightweight and accessible, like Python etc, HTML5 output)
3) In industry, what is the typical role of CPLEX in a production environment: Is it just called by an external application (Java/.NET etc), or is CPLEX Studio used more actively?
Embed the optimisation model in wider business applications using Java, C#, Python, C++, whatever. Make it just part of the normal business systems that people use. It is just software. Make it so that the users really appreciate that the new software actually benefits them each time they use it. Make it easier to use the model than to not use it. Hide the model inside other software. Probably never even mention optimisation to your end users.
The best model in the world, one that could deliver amazing benefits, will achieve nothing of practical value if it doesn't actually get used.
If your target audience or users have to do extra stuff or perform extra steps to use your model, then it will likely not get used very much and may wither and die. If they have to learn new applications etc to use it, it probably won't get used by most people.
By making your model part of their normal day-to-day processes, it will get used, and the practical benefits will come.
I have implemented and support a number of live optimisation applications in several large companies, making decisions that directly affect billions of pounds/dollars of products/revenues per year. Almost all of them have the real optimisation models totally hidden from the users, most of whom have no idea of optimisation or CPLEX; the software in their business systems just works.
There are many options. You may write the model with an algebraic modeling language (AML) like OPL, or with a general-purpose language (GPL).
If you use OPL, then you may call your model from many GPLs, like C++, Java, Python ...
Or you could plug that model into an existing application.
You could call OPL from Excel or a DSX Python notebook, as described at https://www.ibm.com/developerworks/community/forums/html/topic?id=306f3ded-33b8-4d9a-8568-b4288aa64265&ps=25
See the survey I mentioned in 1.
Some users use the CPLEX OPL IDE in order to make decisions and run simulations.
Others use Decision Optimization Center: https://www.ibm.com/us-en/marketplace/ibm-decision-optimization-center
Finally, some write new applications from scratch or plug the model into an existing application.
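As an illustration of the "call it from a GPL" route, here is a minimal sketch in Python using IBM's docplex package (the model itself is a toy example, and it assumes a local CPLEX installation):

from docplex.mp.model import Model

m = Model(name="production")
x = m.continuous_var(name="widgets", lb=0)
y = m.continuous_var(name="gadgets", lb=0)
m.add_constraint(2 * x + y <= 100)  # machine hours
m.add_constraint(x + 3 * y <= 90)   # raw material
m.maximize(3 * x + 2 * y)           # profit

solution = m.solve()
if solution:
    print(solution.get_value(x), solution.get_value(y))

From there, reading parameters from a form or spreadsheet and writing results out to Excel or HTML is ordinary application code, which is exactly the "hide the model inside other software" approach from the first answer.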

Use Cases of NIFI

I have a question about Nifi and its capabilities as well as the appropriate use case for it.
I've read that Nifi is really aiming to create a space which allows for flow-based processing. After playing around with Nifi a bit, what I've also come to realize is its capability to model/shape the data in a way that is useful for me. Is it fair to say that Nifi can also be used for data modeling?
Thanks!
Data modeling is a bit of an overloaded term, but in the context of your desire to model/shape the data in a way that is useful for you, it sounds like it could be a viable approach. The rest of this is under that assumption.
While NiFi employs dataflow principles and a design closely related to flow-based programming (FBP) as a means, its function is a matter of getting data from point A to B (and possibly back again). Of course, systems aren't inherently talking in the same protocols, formats, or schemas, so there needs to be something to shape the data into what the consumer is anticipating from what the producer is supplying. This gets into common enterprise integration patterns (EIP) [1] such as mediation and routing. In a broader sense though, it is simply getting the data to those that need it (systems, users, etc) when and how they need it.
Joe Witt, one of the creators of NiFi, gave a great talk that may be in line with this idea of data shaping in the context of Data Science at a Meetup. The slides of which are available [2].
If you have any additional questions, I would point you to check out the community mailing lists [3] and ask any additional questions so you can dig in more and get a broader perspective.
[1] https://en.wikipedia.org/wiki/Enterprise_Integration_Patterns
[2] http://files.meetup.com/6195792/ApacheNiFi-MD_DataScience_MeetupApr2016.pdf
[3] http://nifi.apache.org/mailing_lists.html
Data modeling might well mean many things to many folks, so I'll be careful using that term here. What I do think is very clear in what you're asking is that Apache NiFi is a great system for molding data into the right format, schema, and content for your follow-on analytics and processing. NiFi has an extensible model, so you can add processors that do this; in many cases you can use the existing processors, and you can even use the ExecuteScript processors to write scripts on the fly to manipulate the data.
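For a taste of the ExecuteScript route, here is a minimal Jython sketch following NiFi's standard session.get/session.write/session.transfer scripting pattern (the "shaped" field is a hypothetical transformation):

import json
from java.nio.charset import StandardCharsets
from org.apache.commons.io import IOUtils
from org.apache.nifi.processor.io import StreamCallback

class Shape(StreamCallback):
    # Rewrites each flowfile's JSON content in place.
    def process(self, inputStream, outputStream):
        text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
        record = json.loads(text)
        record["shaped"] = True  # hypothetical reshaping step
        outputStream.write(bytearray(json.dumps(record).encode("utf-8")))

flowFile = session.get()
if flowFile is not None:
    flowFile = session.write(flowFile, Shape())
    session.transfer(flowFile, REL_SUCCESS)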

Simulate z/OS FINDREP

Is it possible to replace FINDREP (with its STARTPOS and ENDPOS options) with something else in DFSORT or MFSORT?
Example: OUTREC|INREC FINDREP=(IN=C'CHARS',OUT=C'CHARS')
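(With positional limits, such a statement would look something like OUTREC FINDREP=(IN=C'OLD',OUT=C'NEW',STARTPOS=11,ENDPOS=40) in DFSORT; the strings and positions here are placeholders.)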
From the Micro Focus documentation relating to MFSORT and MFJSORT:
Note: MFJSORT and MFSORT are updated with new features on a regular basis but they do not offer a complete emulation of third-party sort utilities. If there are particular features that you need to use, please contact SupportLine to establish if they are available in MFJSORT or MFSORT.
Presumably Micro Focus, or someone, has been involved in planning the migration from z/OS? This should have included an analysis of the work required for providing equivalence to SORT/ICETOOL steps. If not, there may be considerable work which hasn't been budgeted for.
If FINDREP is not available in either MFSORT or MFJSORT (check with SupportLine, as Micro Focus suggests), and if it cannot be made available (if Micro Focus have missed this, it may be possible to apply pressure on that issue), then you do need an alternative.
It is possible, especially since you mention the use of STARTPOS and ENDPOS, that CHANGE will do what you need. CHANGE is available in a Micro Focus product according to this link: http://documentation.microfocus.com/help/index.jsp?topic=%2Fcom.microfocus.eclipse.infocenter.edtest%2FHRFLRHSORT2U.html, which shows this code:
Sort C'cyymmdd'
SORT FIELDS=(1,7,BI,A) * sort C'cyymmdd'
use mfs110a.in org ls record (f 40)
* Transform C'cyymmdd' to C'yyyymmdd'
OUTFIL OUTREC=(1,1,CHANGE=(2, * change C'c' as follows:
C'0',C'19', * C'0' to C'19'
C'1',C'20', * C'1' to C'20'
C'2',C'21'), * C'2' to C'21'
NOMATCH=(C'99'),
2,6) * copy C'yymmdd'
give sortout.dat
Note that there are "extra" commands required for MFSORT/MFJSORT (like the use and give statements, and the inclusion of a program name).
Judging from the above code, Micro Focus have not made IFTHEN available. That will have substantial impact on anything remotely "complex" that is currently being done with DFSORT steps or ICETOOL steps with USING.
Micro Focus do support E15 and E35 "exits". That means you can write a program to make changes at the input stage, and another at the output stage. You tell MFSORT/MFJSORT to use the program(s), and you implement the missing functionality there. On the Mainframe those exits can be written in Enterprise COBOL; I assume that in your new environment they can be written in Micro Focus COBOL. If so, FINDREP can be done with INSPECT. But, given that you want to use STARTPOS and ENDPOS, you will/may need numerous exit-programs.
You should also check, if your systems use ICETOOL, that all the operators that you use are available under Micro Focus's "emulation".
The good news is that the Operating System you are migrating to will have lots of tools capable of implementing what is not supported, but that really needs to be assessed and budgeted.
DFSORT is exceptionally fast, especially at I/O, but not limited to that. You may expect different relative timings for your replacements, especially if you need exit-programs or further processing with "shell" programs on your new OS. Again, this should have been considered prior to this point, but the worry would be that it has not.
If yours is other than a small z/OS system, be aware that you will be in for many shocks as the new distributed system fails to "scale" how you imagine.
If you are migrating yourselves, rather than a Micro Focus project team, you will almost certainly need the support of additional specialist(s). Even with a Micro Focus project team, if you have "complex" SORT steps, additional specialist support will give you huge benefits.
AHLSORT is an alternative to MFSORT that fully supports the DFSORT FINDREP command.
