Friday 30 April 2010

The origins of music, part 2: Musilanguage

One of the major theories about the origins of music is built upon its strong similarity to language. Both share the purpose of communicating predominantly, though not exclusively, through the organisation of sound.

Steven Mithen has noted that language and music have three things in common: they may be vocal, they may be gestural, and they can be written down. Vocalisation must be one of music’s earliest forms, and music is highly gestural, e.g. people find it hard not to tap along to the rhythm. Writing and musical notation appeared in the early civilisations and are relatively late innovations.

To use the terminology of the neuroscientist Steven Brown, both music and language involve a limited repertoire of discrete building blocks, organised into phrases and higher-order structures using combinatorial rules.[1] Put simply, both organise individual acoustic elements using a kind of grammar, which can be built into larger structures such as novels or symphonies. Both also use expressive phrasing, where we modulate acoustic properties such as pitch and rhythm “for the purposes of conveying emphasis, emotional state, and emotive meaning.” In both music and language, certain modes of symbolic expression are the same across all cultures and intensify along with the emotion: happiness is expressed through faster, louder and higher music or speech, sadness by the opposite. Dean Falk, an anthropologist who specialises in the evolution of the brain, tells us that language and music are “neurologically intertwined”, and concludes that “they began evolving together by two million years ago”.[2]

Although similar, music and language are also distinct. Language conveys referential meaning, i.e. a symbolic content based upon the arbitrary association of words with specific referents. Music, on the other hand, has difficulty conveying such meaning – its emphasis is on the emotive, and it is therefore ‘manipulative’, moving our emotions and bodies. This is a divide of emphasis only. Language often uses onomatopoeia to emulate sounds, such as the English “miaow” or “tick tock”. One couldn’t easily devise music to convey the sense, “Please go and fetch the first cup on the left,” but music is not incapable of direct semantic representation. The imitation of a cuckoo in Beethoven’s sixth symphony is a famous example; or, in musical narrative, certain phrases can be associated with certain characters or concepts, as in the operatic use of leitmotif. Both language and music often use gesture, and language would not be so effective without the ‘musical’ changes of pitch, volume etc that add layers of meaning to speech.

Broadly, language and music are structurally similar forms of communication that use different systems: sound as referential meaning and sound as emotive meaning. It could be that their similarities are a pure accident, or that music grew out of language or vice versa. Brown prefers another option – that “music and language must converge at some deep level to have hierarchical organisation flower from two such different grammatical systems.”

Whereas language has a clear purpose in conveying information, the ‘use value’ of music is less obvious. Steven Pinker described music as “auditory cheesecake, an exquisite confection crafted to tickle the sensitive spots of at least six of our mental faculties” [3] – that is, a by-product of adaptations that evolved for other purposes. A slice of cake pleases our taste buds, not because cake was essential to our evolution, but because our ancestors evolved a liking for fat and sugar as excellent energy sources. Music would thus be what Gould called a ‘spandrel’: an accidental offshoot of more significant processes. This wouldn’t stop music being glorious, but it would mean that it was not an evolutionary adaptation.

An ancestral proto-language


Steven Brown explains the similarities by suggesting that music and language originate in a single, ancient form of vocal communication that he terms ‘musilanguage’: “an ancestral stage that was neither linguistic nor musical but that embodied the shared features of modern-day music and language.” Musilanguage combined sound as emotive meaning and as referential meaning. Only later in human development did music and language separate.

This is not a new idea. The Swiss philosopher Jean-Jacques Rousseau, for example, wrote in 1754 that “verse, singing and speech have a common origin... one spoke as much by natural sounds and rhythm as by articulations and words”.[4] Charles Darwin suggested it in 1871 in a context of sexual selection:

When we treat of sexual selection we shall see that primeval man, or rather some early progenitor of man, probably first used his voice in producing true musical cadences, that is in singing, as do some of the gibbon-apes at the present day; and we may conclude from a widely-spread analogy, that this power would have been especially exerted during the courtship of the sexes, – would have expressed various emotions, such as love, jealousy, triumph, – and would have served as a challenge to rivals. It is, therefore, probable that the imitation of musical cries by articulate sounds may have given rise to words expressive of various complex emotions.[5]

In the 1920s, the linguist Otto Jespersen also championed the idea. But it has only received consistent academic attention in the last couple of decades. In The Singing Neanderthals, one of the most prominent recent books on the origins of music, Mithen argues that music and language have common origins in a quasi-musical ‘proto-language’, i.e. a musilanguage (the term I shall keep to for simplicity). Drawing upon archaeology, genetics and linguistics, Mithen follows linguist Alison Wray in proposing that

the precursor to language was a communication system composed of ‘messages’ rather than words; each hominid utterance was uniquely associated with an arbitrary meaning... modern language only evolved when holistic utterances were ‘segmented’ to produce words which could then be composed together to create statements with novel meanings.[6]

This holistic musilanguage would have made “extensive use of variation in pitch, rhythm and melody to communicate information, express emotion and induce emotion in other individuals”. A modern equivalent to how this worked might be proverbial phrases like “don’t count your chickens before they’re hatched”, which we understand in a holistic way rather than according to their individual parts.

Steven Brown listed three essential features of musilanguage:
1. the use of pitch to convey semantic meaning
2. the creation of phrases by combining “lexical-tonal elements”, and
3. use of expressive phrasing to give emotional emphasis.

Mithen refers to musilanguage by the annoying acronym ‘Hmmmmm’, because it is holistic (made of complete meaningful phrases rather than discrete parts), manipulative (influencing our emotional states and behaviour), multimodal (using both sound and movement), musical (temporal, rhythmic, and melodic), and mimetic (making use of sound symbolism and gesture).

Pre-sapiens human species could have possessed such a musilanguage and thus had a musical capacity as well, but true language was probably limited to ourselves, so we may speculate that the separation of musilanguage into two distinct forms of communication was confined to our own species.

Mithen suggests we may hear an echo of musilanguage in the special speech we use for babies, the so-called “infant-directed speech” or “parentese”. This form of speech, which is universal in human cultures, doesn’t rely on proper words, but employs extended vowels, exaggerated pauses, and wider ranges of pitch to allow us to communicate with infants. It works because we are sensitive to the tempos and rhythms of speech long before we have learnt words (and we begin to hear before we even leave the womb). The exaggerated discourse of infant-directed speech probably helps children to pick up how words are formed and sentences constructed. But as it can communicate with infants regardless of the mother tongue of the person using it, it seems to be less about language as such and more about emotive and intentional communication through expressive phrasing. Babies can communicate non-verbally long before they can talk (and the importance of non-verbal sound stays with us into adulthood).

This idea is not Mithen’s. It was Ellen Dissanayake who hypothesised that

it is in the evolution of affiliative interactions between mothers and infants – not male competition and adult courtship – that we can discover the origins of the competencies and sensitivities that gave rise to human music.[7]

We should point out that this interaction can be practised by mothers and fathers, i.e. childcare is not a uniquely female destiny. Dissanayake’s idea is that human groups found the capacities encouraged in parent-infant interaction to be useful both emotionally and functionally in social rituals. Dean Falk has suggested something similar in her book Finding Our Tongues.

We don’t only talk with our infants – we sing to them too, to soothe them, reassure them of our presence, and to make them smile. The psychologist Sandra Trehub has demonstrated that lullabies, like infant-directed speech, sound remarkably similar across cultures. The universality of baby talk and lullabies, which are used instinctively by adults, strongly implies a genetic aspect to both. Mithen’s comment is insightful:

Those who use facial expressions, gestures and utterances to stimulate and communicate with their babies are effectively moulding the infants’ brains into the appropriate shape to become effective members of human communities.

So musilanguage may have conferred an evolutionary advantage through improved socialisation.

Beginnings


Drumming behaviour in chimpanzees, bonobos and gorillas – on the ground, on their chests, or on objects like trees – may represent a distant connection to our own past through a common ancestor some 7–8 million years ago.

Great apes use barks, hoots, screams and other noises to communicate all sorts of messages – what members of the group are up to, the approach of strangers, competitive displays, etc – and sometimes accompany their vocalisations with vigorous movements like shaking branches or stamping. Such behaviours are manipulative, in the sense of trying to get a response from other apes, rather than referential, i.e. there is nothing that resembles words. They are possibly the distant precursors of music, language and dance.

John S. Allen has pointed out:

If we consider the various functions that form the basis of music’s claim to be an adaptation – courtship display, emotional communication, synchronising group behaviour, a mnemonic device for carrying information, and so on – all of these functions are at least as well served by language as by music, which would certainly limit the potential fitness benefits that might accrue with musical expertise. So in order for music to be considered an adaptation in these domains, it would have to have enhanced prelinguistic expression in earlier hominids.[8]

The differences of anatomy between the hominids and other apes as a result of our shift to meat-eating, such as reduced size of the teeth and jaws, allowed us to increase our vocal range. This capacity was also affected by bipedality, which lowered our larynx as well as expanding our brain and nervous system. Increasing group size probably demanded an increase in the quality and quantity of calls. Drawing on Robin Dunbar’s research on the development of language through vocal instead of physical grooming, Mithen speculates that early hominids ‘sang’ to each other, reinforcing their social relationships:

One might have heard predator alarm calls; calls relating to food availability and requests for help with butchery; mother-infant communications; the sounds of pairs and small groups maintaining their social bonds by communicating with melodic calls; and the vocalisations of individuals expressing particular emotions and seeking to induce them in others.[9]

Mithen then imagines the whole group, at the close of the day, engaged in group song.

Musilanguage would have arisen among early hominids and died out with the Neanderthals. It is worth stressing that true music is a uniquely human activity. Tracing music’s lineage back to great ape communication does not mean that there is anything musical about that communication, only that it provided a foundation for a human behaviour. Some animal species are capable of very complex vocalisations which to us sound musical (the obvious example being songbirds). But we must be cautious about making lazy, anthropocentric connections. Apes’ capacity for vocal learning is much poorer than our own. Humans are separated from chimps, our nearest surviving ape relatives, by six million years of evolution, and from birds, some of whose vocal learning is considerably better than chimps’, by far longer again. It is possible to teach a degree of symbolic behaviour to apes, but only under controlled conditions. Animals miss the vital ingredient of creative self-awareness, and so their communication does not contain the rich meanings of human signs, or break out of instinctive patterns. Recognising a phylogenetic aspect to music is one thing; assuming a gradualist development from animal behaviours that ignores the uniqueness of humans is another.

Mimesis


Mithen finds another possible source of musical behaviours, for which he consults the work of the evolutionary psychologist Merlin Donald. Early human species may have possessed a culture in-between that of apes and modern humans, what Donald called “mimetic”. Mimesis, or imitation, would in this context have been deliberate representational behaviour that didn’t use words, which may have played an important role in communicating encounters with new animals and environments as humans spread out of their traditional African homelands.

The earliest attempts at what became music may have been people observing and trying to imitate the sounds of the natural world – as practised by surviving hunter-gatherer societies – or of human labour, such as the rhythmic knapping of flints. These imitations, predominantly created using the voice, could then be repeated, structured, combined, and added to social rituals such as attaining adulthood, burial and so on. Through imitation, early humans could “create new types of tools, colonise new landscapes, use fire and engage in big-game hunting” (Mithen).

Such mimetic behaviour would have operated as holistic, self-contained messages. Hence the title of Mithen’s book – the Neanderthals, although not possessing true language, would have had a rich repertoire of holistic phrases, and therefore some measure of musicality.

Once humans had the ability to vocally imitate the sounds of nature and their own activities, and accompany these vocal signs with gestural ones, we had the seeds of language. This would often have been onomatopoeic: a buzzing noise intended to imitate a bee, for example, could become a word which meant ‘bee’. Another means of language development is ‘sound synaesthesia’, a more precise term for ‘sound symbolism’, wherein the sound not only mimics the animal’s own calls, but tries to capture something of its nature. An example is the use of long sounds like ‘oh’ and ‘ah’ to represent a large, heavy animal. Such sounds could be accompanied by gestures, or mime. This process proceeded in a dialectical spiral, the physiological and lexical assisting one another.

Mimesis is one of the most persuasive theories for the beginnings of music, potent whether or not one subscribes to the existence of an ancestral musilanguage.

From musilanguage to language and music


One of the challenges for proponents of musilanguage is to explain how it made the leap to modern music and language, and when in human history that took place.

According to Brown, the division arose because musilanguage had two aspects that gradually became separate specialisms. The first, symbolic communication emphasising referential meaning, became language. The other, emphasising emotional communication, became music.

Explaining how individual words, or sounds, came to be organised into phrases – i.e. grammar, both lexical and musical – and into extended structures is difficult. Proponents of proto-languages, such as Wray and Arbib, have suggested that holistic phrases could have been gradually broken up into elements of meaning, giving us sentences and then nouns, verbs etc. This could have begun with the recognition of chance associations between particular bits of holistic phrasing and their referents: if a particular phoneme appeared in two phrases which both made some reference to a deer, that phoneme could be held up as an arbitrary label for ‘deer’. Instead of developing individual words and only later a grammar to stick them together with, early humans slowly broke down ‘sung’ phrases into discrete parcels with a precise meaning. The isolation of individual musical notes, together with ways of stitching them together with a musical grammar, could have followed the same course. Freed by language of the need to relay information, musilanguage could become music, concentrating upon communicating emotion and reinforcing group identity.

When precisely this split happened is an open question, but it must have been complete [10] by the time of the flowering of art 50,000 years ago. Going by the archaeological evidence of actual musical instruments, true music, despite its beginnings among earlier hominids, matures for certain only with Homo sapiens.

“Discrete words, that can be combined to make new and unique utterances,” Mithen concludes, “were a relatively late development in the evolutionary process that led to language.”

Music emerged from the remnants of ‘Hmmmmm’ after language evolved. Compositional, referential language took over the role of information exchange so completely that ‘Hmmmmm’ became a communication system almost entirely concerned with the expression of emotion and the forging of group identities, tasks at which language is relatively ineffective. Indeed, having been relieved of the need to transmit and manipulate information, ‘Hmmmmm’ could specialise in these roles and was free to evolve into the communication system that we now call music. As the language-using modern humans were able to invent complex instruments, the capabilities of the human body became extended and elaborated, providing a host of new possibilities for musical sound.[11]

These possibilities were indeed immense, because music evolved as a form of communication. How this communication is made and how it is used can take an infinite number of forms: from handclapping in the playground to an orchestra trumpeting the destiny of humankind.

Conclusion


Mithen sharply disagrees with Pinker’s assessment of music as a non-adaptive ‘extra’. Emotions, he argues, are deeply rooted in our evolutionary past and in our physiology. Why would music affect them so strongly if it was a recent and superficial innovation? In his view the development of music from musilanguage did have adaptive value, enabling communication with infants and group bonding. These behaviours were firmly established with Homo sapiens before our migration from Africa, which explains their universality today.

Mithen’s book tends to make speculative claims without adequate evidence. He says for example that the development of language meant a certain loss of musicality, but presents no strong evidence of this. If modern musical practice often diverges from traditional collective and participatory models, this is not the fault of language but of capitalism, which has commodified, individualised and technologised music-making, dividing it up into categories and specialisations. In many non-Western, and indeed Western, contexts, human beings continue to participate in music spontaneously and collectively much as our ancestors did. His conclusion that the Neanderthals must have been at some level more musical than ourselves seems completely unprovable.[12]

Another shortcoming of Mithen’s book is his commitment to modularity in the brain. Mithen himself notes that the way music processing is distributed through the brain shows that there cannot be a single ‘music module’, so he then has to suggest an increasing number of different modules dealing with different aspects of music. Brain scans have shown that the neural processing of music takes place in many different areas. Mithen acknowledges this, suggesting that modularity does not have to be limited to discrete areas, but he is unable to prove the location or existence of modules.

These shortcomings don’t disprove the theory of musilanguage, which remains one of the most serious attempts to solve the mystery of music’s origins. We will consider some of the other adaptive theories in the next post.



[1] Steven Brown, ‘The Musilanguage Model of Language Evolution’, from Wallin, Merker and Brown (eds), The Origins of Music (2000).
[2] Dean Falk, ‘Hominid Brain Evolution and the Origin of Music’, from Wallin et al, op. cit.
[3] Steven Pinker, How the Mind Works (1997).
[4] Rousseau, Essay On the Origin of Languages (written 1754, pub. posthumously in 1781).
[5] Darwin, Chapter 3 of The Descent of Man, and Selection in Relation to Sex (1871). For a discussion of Darwin’s ideas see W. Tecumseh Fitch, ‘Musical protolanguage: Darwin’s theory of language evolution revisited’ from Language Log (2009).
[6] Steven Mithen, The Singing Neanderthals (2005).
[7] Ellen Dissanayake, ‘Antecedents of the temporal arts in early mother-infant interaction’, from Wallin, Merker and Brown (eds), The Origins of Music (2000).
[8] John S. Allen, The Lives of the Brain: Human Evolution and the Organ of Mind (2009).
[9] Mithen, op. cit.
[10] Though of course musical elements live on in language, via holistic idioms, infant-directed speech, and the persistence of onomatopoeia, sound synaesthesia etc.
[11] Mithen, op. cit.
[12] To be fair it is one of the benefits of popular science writing, as opposed to strictly scientific papers, that one may engage in creative speculation of this kind.

Monday 19 April 2010

The origins of music, part 1

“Music is a strange thing, I would almost say it is a miracle.”
– Heinrich Heine

Most musicologists agree that music – like dance, its close companion – is universal in human cultures. All human beings, apart from those with unusual cognitive deficits, perceive pitch, rhythm, harmony, timbre etc, process them as music in various parts of their brains, and experience emotional responses. This seems to be true at least as far back as 35,000 years ago, when examples of musical instruments from the Upper Paleolithic provide the first unequivocal evidence of music-making. Given the sophistication of those examples, the history of instruments probably extends back even earlier, implying that music has always been practised by anatomically and behaviourally modern human beings. It is also conceivable that musicality existed in some form amongst pre-sapiens human species. So music is both ubiquitous and ancient.

All existing human societies have music – whereas not all have literacy. Parents and carers the world over sing to babies, who are sensitive from birth to tonal variation and rhythm. We use music to motivate armies, to marry, to bury the dead, to worship deities and for sheer fun; we listen in groups and alone; it can make us happy or sad. Although music takes an infinite variety of cultural forms, key elements like rhythm and repetition are common to all, and even people who consider themselves ‘unmusical’ can enjoy and participate in it. Enjoyment of music has no national boundaries, meaning that whereas an untranslated poem will be incomprehensible to millions of people, a piece of music may be appreciated by anyone, and this remains true even of music from completely different periods and cultures to one’s own. In 1973 the British musicologist John Blacking put this well:

Music can transcend time and culture. Music that was exciting to the contemporaries of Mozart and Beethoven is still exciting, although we do not share their culture. The early Beatles’ songs are still exciting although the Beatles have unfortunately broken up. Similarly, some Venda songs that must have been composed hundreds of years ago still excite me. Many of us are thrilled by Koto music from Japan, sitar music from India, Chopi xylophone music, and so on… I am convinced that the explanation for this is to be found in the fact that at the level of deep structures in music there are elements that are common to the human psyche, although they may not appear in the surface structures.[1]

Despite music’s extraordinary universality and its evident importance in human culture (including its part in an entertainment industry worth billions of dollars), research into its origins and evolution has been relatively scarce. Only in the last decade or two, which have seen a surge of interest in human cognition, have scientists begun to study it consistently.

Has music existed from the beginnings of Homo sapiens, or in earlier human species? What was it for? How does it relate to the development of language and other capacities? Did it give some kind of evolutionary advantage? These are not easy questions, and I will be offering no firm answers, as music’s origins are still opaque. Our theories can be difficult to test: like language, music does not fossilise.

Even defining music is not straightforward. Trying to do it in terms of particular styles or instruments, for example, is an obvious dead end. All cultures use rhythm, vocals and melody, together with bodily movements, in order to organise sound, but musical practice varies enormously across cultures. There is always an exceptional case – such as John Cage’s 4’ 33”, a piece consisting of four minutes and thirty-three seconds of silence played by any combination of instruments – to confound definitions [2]. The best we can do is focus on what most music tends to be like, and recognise that not all ‘universal’ musical traits will appear in all music. Although we know of no society that did not have music in practice, a concept of music is not universal, and some cultures (such as the Ewe people of Western Africa) do not even have a word for it, or at least for what the West means by it.

Our understanding of music is further complicated by the subjectivity of our response to it. Some people are obsessed by music, or by one specific form of it, whereas others have no taste for it at all. Every human being will respond to a piece of music in different ways, and furthermore, a person’s reactions to the same piece can vary from one day to the next.

One of music’s most important abilities is arousing emotions in the people who hear it. We shall consider this elsewhere, because we are interested in aesthetically aroused emotion as a topic pertinent to all art forms.

The biology of music


The universality of music suggests that our capacity to create and enjoy it is to some extent genetic, so in our search for its origins we need to draw upon anthropology and evolutionary theory. If we can identify the physiological capacities necessary for making music, we may be able to discover when humans first became capable of it, and gain other insights into its nature.

Like any art, music is not only produced but is also perceived by an audience. (The two cannot be separated, as we only make art that we know can be perceived.) Let us begin with production.

We know from the tool record that human species long before Homo sapiens had the dexterity to hit two objects together in a way that would function as percussion. But even this does not represent the earliest potential limit for musicality. Part of the musical skill set amongst modern hunter-gatherers requires purely bodily actions such as body-slapping, hand-clapping, stamping and so on, which require only basic primate physiology.

One of the first musical skills, predating the creation of instruments, must have been some form of singing. Singing is almost impossible to trace via the fossil record, but for vocalisations in general we have more evidence. Many animals use vocalisations to serve various functions: courtship, territorialism, aggression, and so on. Some species, such as parrots and bats, can even learn vocal [3] behaviours instead of merely inheriting them. Our hominid ancestors must also have had a repertoire of vocalisations, gestures and other behaviours which, following an increase in general intelligence from Australopithecus or early Homo, provided the foundation for more complex communication later. Humans can produce far more sounds than other apes – partly because of the construction of their vocal system, e.g. the descended larynx which extended the vocal tract and increased the range of sounds we could make, and partly because of our gift for vocal learning and imitation.

It is possible that a vocal apparatus capable of chanting and singing dates as early as Homo ergaster (1.8–1.3 million years ago), though the evolutionary biologist W. Tecumseh Fitch has argued that singing requires more precise control and breathing capacity even than normal speech, which would date true singing much later, perhaps to the split between the Neanderthal and modern human lineages.[4] It is likely that Neanderthals had a vocal apparatus much like ours: amongst other similarities, a Neanderthal hyoid bone from 60,000 years ago is similar to that of humans, and their larynx was low in the throat like our own.[5]

As for the perception of music, this depends upon our auditory and cognitive abilities, themselves a product of the evolutionary process. The human ear has thousands of nerve endings that send signals to the brain, and it is sensitive to a great range of frequencies and dynamics. It converts sounds into neural messages which are interpreted by parts of the brain in terms of pitch, rhythm, melody, timbre, time, and so on. This can provoke a variety of physiological and emotional responses such as foot-tapping, swaying, weeping and the rest. Processing music requires some brain specialisation and a division of labour between the hemispheres.

It seems from the study of infants that our perception of music is innate. Despite certain difficulties (not least infants’ inability to tell us what they are experiencing), research such as that of psychologist Sandra Trehub has shown that infants can recognise differences in tone, melody, key and rhythm, sometimes better even than adults. Patricia Kuhl, a researcher in language and brain development, found that babies have an “exquisite sensitivity” to speech, preferring to listen to ‘parentese’ [6], the lulling half-sung speech used by parents with infants. As this sensitivity occurs even before language, it is possible that the mechanisms involved are very ancient, even predating our species.

The range of frequencies perceived by the ear approximates to the range of frequencies of our vocalisations (making our auditory system very well suited to perceiving singing). It is probable that our vocal and auditory abilities evolved in a close relationship with one another.

On one level our musicality can be reduced to electrical signals firing across some of the brain’s hundred billion neurons, or nerve cells. We know that parts of the brain are responsible for different functions: two sections named Broca’s area and Wernicke’s area, for example, are essential for language, and defects in or injuries to the brain can damage people’s ability to understand language (aphasia) or music (amusia).

But aspects of music such as rhythm, melody and pitch seem to be processed in different areas, and there is no evidence that there is any one ‘music area’ of the brain. Consider the totality of a musician playing a tune on an instrument: the process requires not just neurological activity relating to melody but also memory (to remember what fingering is required to get the desired sound), motor skills, listening to and assessing the sounds one is making, breath control, social awareness of the audience’s reaction, etc. The brain’s functions and physical spaces – whether or not one believes it is modular – overlap in complex ways that are poorly understood. The reason there is such a variety of cognitive deficits is that these functions can go wrong in different places, and it is possible that the brain sees the aspects of music as specific examples of more general cognitive tasks rather than treating them as a single musical process. Any hunt for a ‘music gene’ will, like all genetic determinism, be in vain.

The limits of biology


There is certainly a biological aspect to music (without auditory organs, we wouldn’t be able to listen to it). But having the physiological equipment capable of making and perceiving music is one thing; actually using it for that purpose is another. Evidence from Blombos Cave and elsewhere suggests that we had the capacity for art before the emigration from Africa – though unequivocal evidence for music does not exist prior to the appearance of musical instruments in the Upper Paleolithic – but the flowering of art seems to have come thousands of years later and was probably driven by culture. Tracking only the physiological capacity for music is clearly not going to be enough: we must also ask when, how and why we began to use it.

A crow’s syrinx is no less developed than a nightingale’s, but its croak hardly compares to the latter’s song; nearer to home, Neanderthals’ vocal equipment may well have been similar to ours, but there’s no unequivocal evidence that they were at all musical. Despite their physiological similarity to ourselves, earlier human species may not have created music as we understand it, because they are separated from us by a qualitative leap of consciousness. As Marx observed: “It is obvious that the human eye enjoys things in a way different from the crude, non-human eye; the human ear different from the crude ear, etc.”[7] By this he meant that sights and sounds are experienced differently by an animal to how they are experienced by a self-aware human being. Humans are tool-making, creative, universal beings. Although humans are not strictly unique in making tools, having culture, etc, the reality is that in the sum of our creative and intellectual powers we are unique among animal species. This is why birdsong, the most obvious instance of ‘musicality’ in animals, cannot be considered to be music, however musical it may sound to humans. The behaviours of animals are instinctive and functional, not creative in any way comparable to ours.

Also, of course, music always takes a concrete form mediated by the social context lived in by the musicians. The biological aspect of music doesn’t explain why particular groups or individuals produced particular ‘works’. To ignore this wider context is to distort our understanding and turn music into an abstraction. This is what allowed evolutionary psychologist Geoffrey Miller to draw conclusions like this:

I took random samples of... jazz albums... rock albums... and classical music works... Males produced ten times as much music as females, and their musical output peaked in young adulthood, around age thirty, near the time of peak mating effort... [This suggests] that music evolved and continues to function as a courtship display, mostly broadcast by young males to attract females.[8]

Miller’s view is that music, which makes great demands on our time and energy, must have evolved as an adaptation, specifically as a way of finding a mate. It is true that complex sound displays in other animals, such as birds, are related to sexual selection, so it is not unreasonable to explore this in humans. Darwin originally proposed the idea, in The Descent of Man, writing:

it appears probable that the progenitors of man, either the males or females or both sexes, before acquiring the power of expressing their mutual love in articulate language, endeavoured to charm each other with musical notes and rhythm.[9] 

No doubt plenty of males, as in Miller’s account, have succeeded in winning female attention with their skill in music. But Miller’s reduction of music to a heterosexual courtship display, with reference to a narrow sample of modern Western practice (e.g. the promiscuity of Jimi Hendrix), is not adequate to explain most of the music created across human cultures. There is a huge qualitative difference between human music-making and the courtship calls of animals, however fascinating and complex the latter can be. His account also ignores the sexism that has excluded women from cultural production.

In my view sexual selection is an unlikely explanation of music for three reasons. Firstly, sexual selection in animals is usually accompanied by sexual dimorphism, i.e. a divergence of form or capacity between the sexes. Musical display for sexual selection in animals is overwhelmingly the work of males, but in humans, males and females are born with equal musical ability. Secondly, musicality exists from birth, whereas human sexual characteristics generally come into play later, in late childhood and adolescence. Thirdly, the great majority of music is created by groups involving the whole community, not least to encourage collective identity – not for individual sexual display. Poor examples of evolutionary psychology do not necessarily discredit the discipline in principle, but Miller’s approach serves more to ‘prove’ reactionary social stereotypes than to take science forward.

Opinions on the origins of music tend to divide into two camps: one seeing it as an evolutionary adaptation, principally serving natural selection (e.g. sexual selection as with Miller), and one seeing it as a product of other cognitive adaptations and less adaptively important in itself (e.g. Steven Pinker). It is however possible to consider it as a mixture of both. With this context in mind, our next post will look at one of the most popular evolutionary theories: ‘musilanguage’.


[1] John Blacking, How Musical is Man? (1973).
[2] This work by Cage should nonetheless be considered music because it is organised sound, albeit one where the sound is playfully silenced, within a tradition of comparable sound performances (i.e. classical concerts).
[3] ‘Vocal’ is distinct from ‘verbal’, which implies language.
[4] W. Tecumseh Fitch, ‘The Evolution of Music in Comparative Perspective’ (2005).
[5] See for example d’Errico et al, ‘Archaeological Evidence for the Emergence of Language, Symbolism, and Music – An Alternative Multidisciplinary Perspective’, Journal of World Prehistory (2003).
[6] Sometimes called ‘baby talk’, ‘infant-directed speech’ or ‘motherese’. I don’t recommend the last of these terms, as it is sexist in its implication that communicating with infants is a domain for women.
[7] Karl Marx, ‘Private Property and Communism’, Third Manuscript, 1844 Manuscripts (1844).
[8] Geoffrey Miller, ‘Evolution of Human Music Through Sexual Selection’, from Wallin, Merker and Brown (eds), The Origins of Music (2000).
[9] Darwin, Chapter XIX of The Descent of Man, and Selection in Relation to Sex (1871).