Magical World of Parallel Computing


A practical comparison of multitasking libraries

The following code-base compares multiple libraries on a simple experiment of algebraic operations. The results favour certain libraries because they are a more natural fit for compute-bound tasks, whereas the others are better suited to I/O-bound tasks. Threading in Python is considered broken... I've also refrained from using the MPI library of C. I've made a driver.py program which creates 5 sets of test-cases of increasing size to measure the execution times of each version of each program. The comparison is across different libraries and languages, with the following programs:
C: optimal serial code; OpenMP directive-based parallelism; the pthread library.
Python: optimal serial code; pyMP; the Python multiprocessing module.
Note that the gcc compiler can automatically vectorize the addition of arrays x and y on SIMD-capable hardware with appropriate data types. I'll be adding the GPU comparison as soon as I can! Currently I don't h
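To give a feel for what a driver of this kind might look like, here is a minimal Python sketch. It is not the actual driver.py: the function names (make_test_cases, add_serial, time_once, run_driver), the default sizes and the growth factor are all assumptions made for illustration, and only a serial variant is included.

```python
import random
import time


def make_test_cases(count=5, base_size=1_000, growth=4, seed=0):
    """Build `count` pairs of equal-length float lists of increasing size.

    The sizes and the growth factor are illustrative assumptions.
    """
    rng = random.Random(seed)
    cases = []
    size = base_size
    for _ in range(count):
        x = [rng.random() for _ in range(size)]
        y = [rng.random() for _ in range(size)]
        cases.append((x, y))
        size *= growth
    return cases


def add_serial(x, y):
    """Reference serial version: element-wise addition of two lists."""
    return [a + b for a, b in zip(x, y)]


def time_once(func, *args):
    """Return (elapsed_seconds, result) for a single call of func(*args)."""
    start = time.perf_counter()
    result = func(*args)
    return time.perf_counter() - start, result


def run_driver(variants, cases):
    """Time every variant on every test case.

    `variants` maps a label to a callable taking (x, y).
    Returns {label: [seconds for case 1, seconds for case 2, ...]}.
    """
    timings = {name: [] for name in variants}
    for x, y in cases:
        for name, func in variants.items():
            elapsed, _ = time_once(func, x, y)
            timings[name].append(elapsed)
    return timings
```

A parallel variant would be passed in the same dictionary, for example run_driver({"serial": add_serial, "parallel": some_parallel_add}, make_test_cases()), where some_parallel_add is a hypothetical callable with the same (x, y) signature. A single timed call per case is noisy, so a real comparison would probably repeat each measurement several times.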

EnchantingProgram: Spoiler alert

This is part 2 of the "EnchantingProgram" post; read this post first: http://magical-parallel-computing.blogspot.in/2017/04/a-simple-python-program-using.html So, let's see the actual reason for the speedup in the C and C++ programs. Lo and behold, it is the effect of branch prediction! Surprised? Well, at least my comments in the programs should have given you some direction!!! The if condition leads to a branch in the control flow. We know that branch mispredictions lead to pipeline flushes and create a delay in the pipelined execution scheme. Modern microprocessors use complex run-time techniques to improve throughput, execution speed and memory efficiency. One example of such a technique is dynamic branch prediction. A long time ago, microprocessors used only a basic technique called static branch prediction, with two general rules:
A forward branch is presumed to be not taken.
A backward branch is presumed to be taken.
Now, static branch p
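The two static rules above can be sketched as a tiny Python simulation. This is only a toy model, not how any particular processor implements prediction: the instruction addresses and the two branch traces are made up for the example, and a branch counts as "backward" simply when its target address is lower than its own address.

```python
def predict_static(branch_addr, target_addr):
    """Static rule: predict taken for backward branches, not taken for forward ones."""
    return target_addr < branch_addr


def static_accuracy(trace):
    """Fraction of correct predictions.

    `trace` is a sequence of (branch_addr, target_addr, actually_taken) tuples.
    """
    trace = list(trace)
    correct = sum(predict_static(b, t) == taken for b, t, taken in trace)
    return correct / len(trace)


# Hypothetical loop of 10 iterations: the backward branch at address 40
# jumps back to address 16 nine times, then falls through once.
loop_trace = [(40, 16, True)] * 9 + [(40, 16, False)]

# Hypothetical forward branch for an `if` on unpredictable data:
# here it is taken on every other execution.
if_trace = [(24, 36, i % 2 == 0) for i in range(10)]

print(static_accuracy(loop_trace))  # 0.9
print(static_accuracy(if_trace))    # 0.5
```

In this toy model the loop branch is predicted well, while the data-dependent if is right only half of the time, which is the kind of branch where mispredictions and pipeline flushes add up.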

A simple python program using multiprocessing... or is it?

I would like to show you a very simple, yet subtle example of how programs can seem to produce unreasonable outputs. Recently, I was glancing through certain programs in Python, searching for places to optimize code and introduce parallelism. I started thinking of threads immediately, and how independent contexts of computation can speed up code. Although I program frequently with Python, I hadn't been using any kind of explicit parallelism in my code. So, using my C knowledge, I went towards the threading library of Python. Long story short, that was a mistake! It turns out that the Python implementation which is distributed by default (CPython) and PyPy both have a process-wide mutex called the Global Interpreter Lock. This lock is necessary mainly because CPython's memory management is not thread-safe. The GIL blocks any kind of concurrent access to objects in the Python run-time to prevent race conditions or corruption of state. This is effectively a synchroniz
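A small experiment makes the effect of the GIL visible. The sketch below is not the program from the post: count_down, run_serial, run_threaded and timed are hypothetical helpers written for this illustration. It runs the same pure-Python, CPU-bound loop first serially and then in threads.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def count_down(n):
    """Pure-Python CPU-bound loop; the running thread holds the GIL while it executes."""
    while n > 0:
        n -= 1
    return n


def run_serial(n, tasks):
    """Run `tasks` countdowns one after another in the calling thread."""
    return [count_down(n) for _ in range(tasks)]


def run_threaded(n, tasks):
    """Run `tasks` countdowns in separate threads of the same process."""
    with ThreadPoolExecutor(max_workers=tasks) as pool:
        return list(pool.map(count_down, [n] * tasks))


def timed(func, *args):
    """Return (elapsed_seconds, result) for a single call of func(*args)."""
    start = time.perf_counter()
    result = func(*args)
    return time.perf_counter() - start, result
```

Comparing timed(run_serial, 5_000_000, 4) with timed(run_threaded, 5_000_000, 4) on a standard GIL-enabled CPython build, the threaded version is typically no faster than the serial one, because only one thread can execute Python bytecode at a time. The exact numbers depend on the machine and the interpreter version, so none are quoted here.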

Hi there!

The magical world of parallel computing awaits you! We all know serial computing: a program written in our favourite mainstream programming language (C, C++, Java, Python) runs on a single faithful CPU with a module of RAM in a standard PC, one machine instruction after another. (Don't fret even if you don't! We'll discuss concepts gradually, transitioning from serial processing to concurrency to parallelism and beyond!) Now, suppose we have a 2.0 GHz single-core CPU; assuming roughly one simple instruction per clock cycle, it is capable of about 2 billion instructions per second. Compare that to human calculation time, about 2 seconds per instruction!!! (Assume a simple addition or relocation instruction.) What if we want more than that? Just visit the closest computer hardware store and buy another PC with: n processors with m cores, x.yz GHz; superscalar architecture, Hyper-Threading, etc.; r GB GDDR5 RAM and s GB SSD; integrated graphics and a discrete graphics card with abcd GPGPU cores (Wow, that is possibly going to
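The CPU versus human comparison above can be put into numbers with a few lines of Python. The figure of one instruction per clock cycle is an assumption added for this back-of-the-envelope estimate; real processors may retire more or fewer instructions per cycle.

```python
CPU_CLOCK_HZ = 2.0e9                 # 2.0 GHz, from the text
INSTRUCTIONS_PER_CYCLE = 1.0         # assumption: one simple instruction per cycle
HUMAN_SECONDS_PER_INSTRUCTION = 2.0  # from the text

cpu_instructions_per_second = CPU_CLOCK_HZ * INSTRUCTIONS_PER_CYCLE
human_instructions_per_second = 1.0 / HUMAN_SECONDS_PER_INSTRUCTION

# How many times faster the CPU is than the human.
speed_ratio = cpu_instructions_per_second / human_instructions_per_second

# How long a human would need for the work the CPU does in one second.
SECONDS_PER_YEAR = 365.25 * 24 * 3600
human_years_for_one_cpu_second = (
    cpu_instructions_per_second * HUMAN_SECONDS_PER_INSTRUCTION / SECONDS_PER_YEAR
)

print(f"CPU is {speed_ratio:.0e} times faster")
print(f"One CPU-second is about {human_years_for_one_cpu_second:.0f} human-years")
```

Under these assumptions the CPU is about 4 billion times faster, and one second of CPU work corresponds to roughly 127 years of non-stop human calculation.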