CONCURR AND CONQUER
There has been a paradigm shift in the way we program computers to do
what we want them to do. There is nothing more entertaining than the
beauty of the electron. Right from the moment you hit Enter on the
keyboard: the compiler churns out assembly language for the native
machine, the assembler generates the object files, the linker links
them up with the libraries and other objects, the loader puts the
binary image of 0s and 1s into memory, the sequence of bits is
interpreted as micro-ops in the processor, and life flows into the
flip-flops and the transistors, the drift velocity of the electrons
carrying them to and fro across the channels, turning the silicon on
and off.
This view can no longer be localized to just one processor. We don't
need embarrassingly parallel problems to justify parallel
architectures, or vice versa.
In my opinion, there has been no better mother of change than the need
to manage complexity. The order of growth of complexity has driven the
IT revolution. Procedural to functional to object-oriented
programming, all due to the one basic need of managing complexity.
Flat to network to relational to object-oriented databases, for the
same reason. This is taking an amortized view, mind you. If you go
down to the intricacies of the baud, it stops at the bits and the
electrons; if you take an aggregate view, it zooms out towards the
white light shining far beyond the matrix of the human race.
Coming back to what I was talking about earlier, there is a difference
between concurrency and parallelism. Concurrency is when two events
can occur in any order without affecting the outcome of the execution.
Parallelism is when these events actually occur at the same time on
different processors. These processors might share a common memory and
clock, or they might be loosely coupled, sharing neither a common bus
nor a clock. As a programmer, I would always program concurrently,
knowing that the program, if run on a parallel architecture, would
exploit parallelism as well and get me more bang for the buck from my
multi-threaded application.
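
To make that concrete, here is a minimal sketch using POSIX threads
(the worker function and iteration count are made up for
illustration): two workers written concurrently. Run on one core they
merely interleave; on a multicore box the very same binary runs them
in parallel, and the outcome is identical either way, which is the
whole point.

/* Two independent workers, written concurrently with POSIX threads.
 * Compile with: gcc demo.c -lpthread */
#include <pthread.h>
#include <stdio.h>

static void *worker(void *arg)
{
    long id = (long)arg;
    unsigned long sum = 0;
    for (unsigned long i = 0; i < 100000000UL; i++)  /* CPU-bound busywork */
        sum += i;
    printf("worker %ld done (sum=%lu)\n", id, sum);
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, (void *)1L);
    pthread_create(&t2, NULL, worker, (void *)2L);
    pthread_join(t1, NULL);   /* order of completion may vary; */
    pthread_join(t2, NULL);   /* the result does not */
    return 0;
}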
When do you thread your code?
1. Independent tasks: When you have tasks in your system which run
independent of each other, and the order of their execution doesn't
affect the outcome.
2. I/O use: When some tasks are I/O intensive, you don't want them to
hog the system resources while they sit blocked doing nothing; other
tasks could use that time.
3. CPU use: When some tasks are CPU intensive, you don't want them to
monopolize the CPU and starve other tasks.
4. Asynchronous event handling: When some task is expected to receive
service requests or signals asynchronously, I would rather thread that
task than make it busy-wait or poll for the event to fire (see the
first sketch after this list).
5. Resource sharing: As opposed as it may sound to the first goal of
independent tasks, threading enables tasks to share the process-wide
state of the text segment and read-only data, while maintaining their
own private execution states in their stacks and register sets.
Communication and synchronization can be facilitated via message
passing, shared memory or other IPC mechanisms. This leads to an
important form of parallel execution: pipelining. Think of it as
pipelining in software, pipelining in thought, in the mind (see the
second sketch after this list).
6. Existing multi-process code: As a rule of thumb, if you already
have multi-process code, you would most of the time benefit from
making it multi-threaded. Threads are lightweight components compared
to processes. This has two major advantages: the cost of a context
switch is low, and more work is done in user space than in kernel
space.
7. Concurrency-conducive class of problems: As with the embarrassingly
parallel problems I mentioned earlier, there are various problems
which are more easily and naturally solved via concurrent thinking and
algorithms, e.g. matrix multiplication.
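
For point 4, here is a minimal sketch of what I mean by not
busy-waiting, assuming POSIX threads: the handler thread blocks on a
condition variable and consumes no CPU until the event fires. The
single event_pending flag is made up; it stands in for a real event
source.

/* A dedicated thread sleeps on a condition variable and wakes only
 * when an event arrives -- no busy waiting, no polling. */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static int event_pending = 0;

static void *event_handler(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    while (!event_pending)               /* blocks; consumes no CPU */
        pthread_cond_wait(&cond, &lock);
    event_pending = 0;
    pthread_mutex_unlock(&lock);
    printf("event handled\n");
    return NULL;
}

int main(void)
{
    pthread_t h;
    pthread_create(&h, NULL, event_handler, NULL);

    sleep(1);                            /* pretend work, then fire the event */
    pthread_mutex_lock(&lock);
    event_pending = 1;
    pthread_cond_signal(&cond);
    pthread_mutex_unlock(&lock);

    pthread_join(h, NULL);
    return 0;
}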
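
And for point 5, a sketch of pipelining in software, again with POSIX
threads: stage 1 produces items, stage 2 consumes them through a
one-slot hand-off guarded by a mutex and condition variables. While
stage 2 works on one item, stage 1 is already computing the next. The
one-slot buffer and the squaring are made up for illustration.

/* A two-stage software pipeline: producer and consumer threads
 * overlapping their work through a one-slot buffer. */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  not_empty = PTHREAD_COND_INITIALIZER;
static pthread_cond_t  not_full  = PTHREAD_COND_INITIALIZER;
static int slot, full = 0;

static void *stage1(void *arg)              /* producer */
{
    (void)arg;
    for (int i = 1; i <= 5; i++) {
        pthread_mutex_lock(&lock);
        while (full)                        /* wait until the slot is free */
            pthread_cond_wait(&not_full, &lock);
        slot = i * i;                       /* "compute" one item */
        full = 1;
        pthread_cond_signal(&not_empty);
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

static void *stage2(void *arg)              /* consumer */
{
    (void)arg;
    for (int i = 0; i < 5; i++) {
        pthread_mutex_lock(&lock);
        while (!full)                       /* wait until an item arrives */
            pthread_cond_wait(&not_empty, &lock);
        printf("stage 2 got %d\n", slot);
        full = 0;
        pthread_cond_signal(&not_full);
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

int main(void)
{
    pthread_t a, b;
    pthread_create(&a, NULL, stage1, NULL);
    pthread_create(&b, NULL, stage2, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return 0;
}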
...and now I am tired and very sleepy. I will add to this list soon;
if you have suggestions, mail me at nishant@purecode.us