How much memory can the 8-bit CPU (8086) support? - cpu

I have an 8-bit Famiclone, which has its own DOS. Just wondering: how much memory can it support in theory?
https://helloacm.com/the-8-bit-dos-by-famicom-clone-bbgdos-in-the-1990s/
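The theoretical limit is set by the number of address lines rather than by the 8-bit data width. The sketch below assumes a flat address space with no bank switching (cartridge mapper hardware can page in more memory than the CPU addresses directly). It also assumes the clone uses a 6502-style CPU with 16 address lines, as the original Famicom does; that has not been checked against the linked article. The 8086 named in the title is a 16-bit CPU with 20 address lines, so it is listed only for comparison. The function name is illustrative.

```python
def addressable_bytes(address_bits):
    """Bytes directly addressable with the given number of address lines."""
    return 2 ** address_bits


# Address-bus widths below are the assumptions stated in the lead-in.
for name, bits in [("6502-style CPU (assumed for the Famiclone)", 16),
                   ("8086 (for comparison)", 20)]:
    kib = addressable_bytes(bits) // 1024
    print("%s: %d address lines -> %d KiB" % (name, bits, kib))
```

Under these assumptions the directly addressable space is 64 KiB for the 6502-style part and 1024 KiB (1 MiB) for the 8086.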

Related

Tensorflow Neural Network faster on CPU than GPU

I have created a neural network classifier with 2 hidden layers of [50, 25] units.
The model is training much faster on CPU than GPU.
My questions are :
Is this expected? I can see that the architecture is small, but not so small that it should be faster on the CPU :/
How should I debug this?
I tried increasing the batch size, expecting the GPU to overtake the CPU beyond some batch_size. But I don't see that happening.
My code is in Tensorflow 1.4.
Given the size of the network (very small), I'm inclined to think this is a DMA issue: copying data from the CPU to the GPU is expensive, maybe expensive enough to outweigh the GPU being much faster at larger matrix multiplications.
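The transfer-cost argument can be made concrete with a rough cost model. This is only a sketch: every throughput, bandwidth, and overhead figure below is a hypothetical placeholder, not a measurement, and the input width (100) and class count (10) are assumed because the question gives only the hidden sizes [50, 25]. It counts forward-pass work only and ignores TensorFlow-specific overheads.

```python
def dense_forward_flops(layer_sizes, batch_size):
    """Forward-pass FLOPs for dense layers, counting a multiply-add as 2."""
    weights = sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))
    return 2 * batch_size * weights


def cpu_step_seconds(flops, cpu_flops_per_s=50e9):
    # cpu_flops_per_s is a hypothetical sustained CPU throughput.
    return flops / cpu_flops_per_s


def gpu_step_seconds(flops, input_bytes, gpu_flops_per_s=5e12,
                     bus_bytes_per_s=10e9, fixed_overhead_s=1e-4):
    # All three defaults are hypothetical: GPU throughput, host-to-device
    # bandwidth, and a fixed per-step launch/synchronization cost.
    return (fixed_overhead_s
            + input_bytes / bus_bytes_per_s
            + flops / gpu_flops_per_s)


def compare(layer_sizes, batch_size, bytes_per_value=4):
    """Return (cpu_seconds, gpu_seconds) for one forward step."""
    flops = dense_forward_flops(layer_sizes, batch_size)
    input_bytes = batch_size * layer_sizes[0] * bytes_per_value
    return cpu_step_seconds(flops), gpu_step_seconds(flops, input_bytes)


small = compare([100, 50, 25, 10], batch_size=128)
large = compare([100, 4096, 4096, 10], batch_size=128)
print("small net  cpu=%.2e s  gpu=%.2e s" % small)
print("large net  cpu=%.2e s  gpu=%.2e s" % large)
```

With these placeholder numbers the fixed per-step cost dominates the small network, so the CPU comes out ahead, while the wide network favours the GPU. Real figures for a given machine have to be measured.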

CPU frequency impact on built-in graphics card

I'm working on building a Dynamic Voltage Frequency Scaling (DVFS) algorithm for a video decoding application running on an Intel Core i7-6500U CPU (Skylake). The application is to support both software and hardware decoder modules, and the software decoder is working as expected. The algorithm controls the operating frequency of the CPU, which in turn determines the operating voltage, thereby reducing the overall energy consumption.
My question is about the hardware decoder available in the Intel Skylake processor (Intel HD Graphics 520). The experimental results for the two decoders suggest that, when using the DVFS algorithm, the reduction in energy consumption is much smaller for the hardware decoder than for the software decoder.
Does the CPU frequency level set in software, before the video frame is passed to the hardware decoder, actually have an impact on the energy consumption of the hardware decoder?
Does the Intel HD Graphics 520 GPU on the same chip as the CPU have any impact on the CPU's operating frequency and voltage level?
Why did you need to implement your own DVFS in the first place? Didn't Skylake's self-regulating mode work well? (where you let the CPU's hardware power management controller make all the frequency decisions, instead of just choosing whether to turbo or not).
Setting the CPU core clock speeds should have little to no effect on the GPU's DVFS. It's in a separate domain, and not linked to any of the cores (which can each choose their clocks individually). As you can see on Wikipedia, that SKL model can scale its GPU clocks from 300MHz to 1050MHz, and is probably doing so automatically if you're using an OS running Intel's normal graphics drivers.
For more about how Skylake power management works under the hood, see Efraim Rotem's (Lead Client Power Architect) IDF2015 talk (audio+slides, very good stuff). The title is Skylake Deep Dive: A New Architecture to Manage Power Performance and Energy Efficiency.
There's a link to the list of IDF2015 sessions in the x86 tag wiki.
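The premise in the question, that lowering frequency allows a lower voltage and therefore less energy, follows from the usual first-order approximation that dynamic power scales as C·V²·f. A minimal sketch of that relation is below; it ignores static (leakage) power, assumes the switched capacitance is unchanged, and uses illustrative ratios rather than any Skylake operating points.

```python
def relative_dynamic_power(freq_ratio, voltage_ratio):
    """Dynamic power relative to baseline, using P ~ C * V^2 * f."""
    return voltage_ratio ** 2 * freq_ratio


def relative_dynamic_energy(voltage_ratio):
    """Energy for a fixed amount of work: runtime scales as 1/f, so E ~ V^2."""
    return voltage_ratio ** 2


# Hypothetical operating point: half the frequency at 80% of the voltage.
print(relative_dynamic_power(freq_ratio=0.5, voltage_ratio=0.8))
print(relative_dynamic_energy(voltage_ratio=0.8))
```

In this model the saving applies only to the domain whose voltage and frequency actually change, which is consistent with the answer's point that the GPU sits in a separate domain.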

Parallelizable JPEG-like compression using only DCT and run-length encoding stages: what sort of compression/performance is possible?

We have to compress a ton o' (monochrome) image data and move it quickly. If one were to use just the parallelizable stages of JPEG compression (DCT and run-length encoding of the quantized results) and run them on a GPU so that each block is compressed in parallel, I am hoping that would be very fast and still yield a very significant compression factor, like full JPEG does.
Does anyone with more GPU / image compression experience have any idea how this would compare, in both compression and performance, with using libjpeg on a CPU? (If it is a stupid idea, feel free to say so - I am an extreme novice in my knowledge of CUDA and the various stages of JPEG compression.) Certainly it will give less compression and hopefully(?) be faster, but I have no idea how significant those factors may be.
You could hardly get more compression on a GPU - there are just no complex-enough algorithms which can use that MUCH power.
When working with simple algos like JPEG, the work is so simple that you'll spend most of the time transferring data over the PCI-E bus (which has significant latency, especially when the card does not support DMA transfers).
The positive side is that if the card has DMA, you can free up the CPU for more important stuff and get image compression "for free".
In the best case, you can get about a 10x improvement on a top-end GPU compared to a top-end CPU, provided that both the CPU and GPU code are well optimized.
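The per-block pipeline the question describes (DCT, quantization, run-length encoding) can be sketched on the CPU to show why blocks are independent of each other. This is a simplified illustration, not libjpeg's method: it uses a single uniform quantization step instead of a JPEG quantization table, skips the level shift, scans coefficients in row-major rather than zigzag order, and omits entropy coding. All function names and the `step` value are illustrative.

```python
import math

N = 8  # block edge length used by baseline JPEG


def dct_2d(block):
    """Naive orthonormal 2-D DCT-II of one N x N block (O(N^4), for clarity)."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out


def quantize(coeffs, step=16):
    """Uniform quantization; `step` is an illustrative value, not a JPEG table."""
    return [[int(round(c / step)) for c in row] for row in coeffs]


def run_length_encode(values):
    """Collapse consecutive equal values into (value, count) pairs."""
    runs = []
    for value in values:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return [tuple(run) for run in runs]


def compress_block(block, step=16):
    """DCT -> quantize -> RLE for a single block; no state shared across blocks."""
    quantized = quantize(dct_2d(block), step)
    flat = [c for row in quantized for c in row]  # row-major, not zigzag
    return run_length_encode(flat)


flat_gray = [[128] * N for _ in range(N)]  # a uniform mid-gray block
print(compress_block(flat_gray))
```

Because `compress_block` reads only its own 8x8 input, each block could in principle map to one GPU thread block; the variable-length output is the part that needs extra care when packing results in parallel.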

Fastest real time decompression algorithm

I'm looking for an algorithm to decompress chunks of data (1k-30k) in real time with minimal overhead. Compression should preferably be fast but isn't as important as decompression speed.
From what I could gather LZO1X would be the fastest one. Have I missed anything? Ideally the algorithm is not under GPL.
lz4 is what you're looking for here.
LZ4 is lossless compression algorithm, providing compression speed at 400 MB/s per core, scalable with multi-cores CPU. It features an extremely fast decoder, with speed in multiple GB/s per core, typically reaching RAM speed limits on multi-core systems.
Try Google's Snappy.
Snappy is a compression/decompression library. It does not aim for maximum compression, or compatibility with any other compression library; instead, it aims for very high speeds and reasonable compression. For instance, compared to the fastest mode of zlib, Snappy is an order of magnitude faster for most inputs, but the resulting compressed files are anywhere from 20% to 100% bigger. On a single core of a Core i7 processor in 64-bit mode, Snappy compresses at about 250 MB/sec or more and decompresses at about 500 MB/sec or more.
If you cannot use GPL-licensed code, your choice is clear: zlib. It has a very permissive license, fast compression, a fair compression ratio, and very fast decompression; it works everywhere and has been ported to every sane language.
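Whichever library is chosen, the claims above are easy to check on your own data sizes. The sketch below uses Python's standard `zlib` module only because it ships with the interpreter; LZ4 and Snappy need third-party bindings and are not shown. The sample payload is synthetic and highly repetitive, so its ratio and timing say nothing about real data, and the helper name is illustrative.

```python
import time
import zlib


def roundtrip(chunk, level=1):
    """Compress a chunk, then time only the decompression step."""
    compressed = zlib.compress(chunk, level)  # level 1 favours speed
    start = time.perf_counter()
    restored = zlib.decompress(compressed)
    elapsed = time.perf_counter() - start
    return compressed, restored, elapsed


# Synthetic 30 KiB chunk, the upper end of the 1k-30k range in the question.
chunk = b"sensor=42;status=ok;" * 1536
compressed, restored, elapsed = roundtrip(chunk)

print("original bytes:  ", len(chunk))
print("compressed bytes:", len(compressed))
print("decompress time: %.6f s" % elapsed)
```

A single timing like this is noisy; repeating the decompression many times and taking the minimum gives a steadier figure.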

How to do JPEG compression with the lowest CPU usage?

I need to convert raw image data to JPEG.
But I don't need anything special in terms of best quality, minimum size, etc.
The only thing I need is minimum CPU usage.
I am new to JPEG compression.
Can you please advise which parameters give the lowest CPU usage when converting to JPEG?
I would like to use IPP (Intel's performance library).
An example using the IPP JPEG library would be great.
But a sample from any other JPEG library would also be appreciated.
And if you know of any JPEG library that performs better than IPP's JPEG library, please let me know.
Thanks in advance.
Regards.
Do you mean 'fastest'? As in 'uses the cpu for the least amount of time'?
If you just mean CPU load, then the best way to lower CPU usage (if you want to do something else at the same time) is to ask the operating system to lower the priority of the program.
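Lowering the priority can be done from inside the program. A minimal sketch, assuming a Unix-like system where Python's `os.nice` is available (it is not on Windows); the helper name and the increment of 10 are illustrative choices.

```python
import os


def lower_priority(increment=10):
    """Add `increment` to this process's niceness and return the new value.

    A higher niceness means the scheduler favours other processes.
    """
    return os.nice(increment)


new_niceness = lower_priority(10)
print("niceness is now", new_niceness)
# ... run the JPEG encoding work here; it yields CPU time to other processes.
```

This does not reduce the total CPU time the encoding needs; it only lets other work go first when the machine is busy.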
