Quite rightly, biodiversity is in the news a lot, but it is still
greatly misunderstood. An important reason is that, as with
'information', we have one single word (actually a portmanteau term:
biological diversity)
to do many jobs. The term has a political significance, where it
summarises the living natural environment, typically as represented by
particular charismatic species or habitats, especially those under
threat of destruction. In its political or common-language usage,
biodiversity means both the significant species and the habitats in
which they live; it might even mean beautiful landscapes. Certainly if
you ask people on the street what biodiversity means to them (I have
tried this), many will say things like "species like tigers and
elephants that are going extinct" or "forests and butterflies etc.".
Well, I am not going to criticise this, because I think it has an
important role in representing the living natural world as people
experience and think of it. All that is important.
"The variability among living organisms from all sources including,
inter alia, terrestrial, marine and other aquatic ecosystems and the
ecological complexes of which they are part; this includes diversity
within species, between species and of ecosystems." (CBD, Article 2;
UNEP, 1992)
But scientists who want to measure the diversity of life, and to
understand what determines it and what it affects, need a more precise
definition that is quantitative and unambiguous.
Biodiversity as total Difference
Diversity has become one of those politically loaded words. It is the
legal and moral obligation of organisations in many countries of the
world to meet a minimum standard of 'diversity' in their workforce.
Many organisations say that it is in their best interests to build and
maintain a 'diversity of talents' and policy makers might claim to
consider a 'diversity of views'. In every case, the word 'diversity' is
being used as a sort of euphemism for differences that matter. The
reason for the rather tangential language is that for differences to be
identified, as a requirement for quantifying them, people have to be
categorised and that is considered impolite in modern culture.
The Statistical idea of Biodiversity
Imagine a bag of objects which you can
take out one by one and place on the table: what you take out will be a
sample of the objects in the bag. Let’s say they differ in colour,
shape and size. So your first object might be a large round blue ball
and the next might be a small green cube, etc. The sample will contain
a set of objects that can differ in three different ways: each of these
is an axis of diversity. If the sample contains just two objects, there
can be at most three differences between them (one pairwise comparison
for each category of difference).
How much information is needed to describe the two-object sample? The
answer is three bits: one for each axis, recording whether or not the
two objects differ on it. That is the information content of the
sample, in these terms only.
This information content escalates rapidly as we take a larger sample.
Suppose there are at least three colours, at least three shapes and at
least three sizes. Then with a sample of 3 objects there can be as many
as nine differences among the objects, and as few as none (if all the
objects extracted from the bag are the same). The probability of
observing a given number of differences in the sample depends on the
number of categories defined for each dimension of diversity: this is
termed the number of levels. Notice now that we are describing a
situation that is mathematically equivalent to the one used to explain
thermodynamic entropy in terms of the number of ways of arranging the
state of the system (termed the multiplicity), which is expressed via a
probability distribution.
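
To see how the number of levels shapes that probability distribution,
the sketch below enumerates every possible sample of three objects. It
assumes, purely for illustration, that every level of every dimension
is equally likely and that objects are drawn independently; the
function name difference_distribution is hypothetical.

```python
from collections import Counter
from itertools import combinations, product


def difference_distribution(levels, sample_size=3, dimensions=3):
    """Probability of each total number of differences in a sample.

    Assumption (not from the text): all levels are equally likely and
    objects are drawn independently, so every sample is equally probable.
    """
    objects = list(product(range(levels), repeat=dimensions))
    counts = Counter()
    for sample in product(objects, repeat=sample_size):
        total = sum(
            a[d] != b[d]
            for a, b in combinations(sample, 2)  # each unique pairing
            for d in range(dimensions)           # each dimension of diversity
        )
        counts[total] += 1
    n_samples = len(objects) ** sample_size
    return {k: counts[k] / n_samples for k in sorted(counts)}


dist2 = difference_distribution(levels=2)
dist3 = difference_distribution(levels=3)

print(max(dist2), max(dist3))  # 6 9
```

With only two levels per dimension, three objects can never all differ
on the same axis, so the maximum falls from nine differences to six.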
More straightforwardly, we can think about what the different
dimensions of biodiversity are and what levels should apply to them.
The simplest approach is to describe what is physically present at some
definite scale of organisation. For example, among n organisms there
are (n^2 - n)/2 differences at the organism level (comparing each possible
unique pairing). If we cluster the organisms into their taxonomic
classes and find three distinct classes, then we find 3 differences
among them (A-B; A-C; B-C). On the other hand, if we look at the more
fundamental genetic scale, with say 12 organisms, we may for example
find that all the organisms share 50% of their genes (so no difference
there) and, to keep it simple, let the remaining half of each genome be
unique to its organism. With each organism having a genome of 2000
genes (it’s just an illustration), each contributes 1000 unique genes,
so there are 12 x 1000 = 12000 unique genes in the assembly of
organisms and (12000^2 - 12000)/2 = 71994000 differences within the
total genetic pool. Now, how much information is there in k
differences? The answer is well explained by a short story, found in
Zernike (1972) and a little updated here.
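
The pair counts used in this paragraph are easy to verify. The sketch
below relies only on the illustrative figures given above (12
organisms, 2000 genes each, half of them shared); the function and
variable names are hypothetical.

```python
def pairwise_differences(n):
    """Number of unique pairings among n items: (n^2 - n) / 2."""
    return (n * n - n) // 2


organisms = 12
genome_size = 2000
unique_per_organism = genome_size // 2          # the half that is not shared
unique_genes = organisms * unique_per_organism  # 12000

print(pairwise_differences(3))             # 3 (A-B, A-C, B-C)
print(pairwise_differences(unique_genes))  # 71994000
```
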