DO AIs DREAM IN MIDI: “LOCK-IN IN SOFTWARE”

Ric Amurrio
Mar 1, 2019


EPISODE 31 MUSIC IN PHASE SPACE

I think where people go wrong in imagining post-capitalist economies is starting with values. The stacking order is technology → economics → values. You need to start with alternative technological principles. Example: design with degradation/aging as a feature, not a bug.

For example: you can’t impose “sustainability” as a bolted-on value onto an economy that is based on maximally valuing pristine newness as the default. Imagine an economy where most things peak in quality/value 30% into their lifespan, like living things. Digital stuff is that way.

Venkatesh Rao

NATURAL INSTRUMENTS VS ELECTRONIC INSTRUMENTS

Natural instruments — that is, acoustic instruments made out of real-world materials such as metal and wood — tend to produce energy at several frequencies at once because of the way the internal structure of their molecules vibrates. Suppose that I invent an instrument that, unlike any natural instruments we know of, produces energy at one, and only one, frequency. Let’s call this hypothetical instrument a generator (because it can generate tones of specific frequencies). If I line up a bunch of generators, I could set each one of them to play a specific frequency corresponding to the overtone series for a particular instrument playing a particular tone. I could have a bank of these generators making sounds at 110, 220, 330, 440, 550, and 660 Hz, which would give the listener the impression of a 110 Hz tone played by a musical instrument. Furthermore, I could control the amplitude of each of my generators and make each of the tones play at a particular loudness, corresponding to the overtone profile of a natural musical instrument. If I did that, the resulting bank of generators would approximate the sound of a clarinet, or flute, or any other instrument I was trying to emulate.
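The generator bank described above can be sketched in a few lines of code. This is a minimal illustration: the sample rate and the amplitude profile are arbitrary assumptions, and the function names are invented here, not anything standard.

```python
import math

SAMPLE_RATE = 8000  # samples per second; an assumption for this sketch

def generator(freq_hz, amp, n_samples, rate=SAMPLE_RATE):
    """One 'generator': a pure sine tone at a single frequency."""
    return [amp * math.sin(2 * math.pi * freq_hz * i / rate)
            for i in range(n_samples)]

def additive_tone(partials, n_samples, rate=SAMPLE_RATE):
    """Sum a bank of generators, one per (frequency, amplitude) pair."""
    bank = [generator(f, a, n_samples, rate) for f, a in partials]
    return [sum(samples) for samples in zip(*bank)]

# A 110 Hz tone built from the harmonics named in the text, each with a
# made-up loudness standing in for a real instrument's overtone profile.
partials = [(110, 1.0), (220, 0.5), (330, 0.4),
            (440, 0.3), (550, 0.2), (660, 0.1)]
tone = additive_tone(partials, n_samples=SAMPLE_RATE // 10)  # 100 ms
```

Changing only the amplitude list changes the perceived instrument while the perceived pitch stays at 110 Hz, which is the whole idea of additive synthesis.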

Additive synthesis such as the above approach achieves a synthetic version of a musical instrument timbre by adding together elemental sonic components of the sound. Many pipe organs, such as those found in churches, have a feature that will let you play around with this. On most pipe organs you press a key (or a pedal), which sends a blast of air through a metal pipe. The organ is constructed of hundreds of pipes of different sizes, and each one produces a different pitch, corresponding to its size, when air is shot through it; you can think of them as mechanical flutes, in which the air is supplied by an electric motor rather than by a person blowing. The sound that we associate with a church organ — its particular timbre — is a function of there being energy at several different frequencies at once, just as with other instruments. Each pipe of the organ produces an overtone series, and when you press a key on the organ keyboard, a column of air is blasted through more than one pipe at a time, giving a very rich spectrum of sounds.

These supplementary pipes, in addition to the one that vibrates at the fundamental frequency of the tone you’re trying to play, either produce tones that are integer multiples of the fundamental frequency, or are closely related to it mathematically and harmonically.

The organ player typically has control over which of these supplementary pipes he wants to blow air through by pulling and pushing levers, or drawbars, that direct the flow of air. Knowing that clarinets have a lot of energy in the odd harmonics of the overtone series, a clever organ player could simulate the sound of a clarinet by manipulating drawbars in such a way as to re-create the overtone series of that instrument. A little bit of 220 Hz here, a dash of 330 Hz, a dollop of 440 Hz, a heaping helping of 550 Hz, and voilà! — you’ve cooked yourself up a reasonable facsimile of an instrument.

Electronics

Starting in the late 1950s, scientists began experimenting with building such synthesis capabilities into smaller, more compact electronic devices, creating a family of new musical instruments known collectively as synthesizers. By the 1960s, synthesizers could be heard on records by the Beatles (on “Here Comes the Sun” and “Maxwell’s Silver Hammer”) and Walter/Wendy Carlos (Switched-On Bach), followed by groups who sculpted their sound around the synthesizer, such as Pink Floyd and Soft Machine.

Many of these synthesizers used additive synthesis as I’ve described it here, and later ones used more complex algorithms such as waveguide synthesis (invented by Julius Smith at Stanford) and FM synthesis (invented by John Chowning at Stanford). But merely copying the overtone profile, while it can create a sound reminiscent of the actual instrument, yields a rather pale copy. There is more to timbre than just the overtone series. Researchers still argue about what this “more” is, but it is generally accepted that, in addition to the overtone profile, timbre is defined by two other attributes that give rise to a perceptual difference from one instrument to another: attack and flux.
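Attack, for instance, can be crudely modeled as an amplitude ramp applied to the start of a tone. This is a deliberately simplified sketch of my own; a real attack transient also reshapes the overtone balance over time, not just the loudness.

```python
def apply_attack(samples, attack_len):
    """Fade in the first attack_len samples with a linear ramp from 0 to 1."""
    out = list(samples)
    for i in range(min(attack_len, len(out))):
        out[i] *= i / attack_len  # 0.0 at the onset, approaching 1.0
    return out

# A percussive (short) versus a bowed-string-like (long) attack on the
# same steady tone already sounds like two different instruments.
shaped = apply_attack([1.0] * 8, attack_len=4)
```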

Chowning noticed that modulating the frequency of one sine wave with another as they played created sounds that were musical. By controlling these parameters just so, he was able to simulate the sounds of a number of musical instruments. This new technique became known as frequency modulation synthesis, or FM synthesis, and became embedded first in the Yamaha DX7 and DX9 line of synthesizers, which revolutionized the music industry from the moment of their introduction in 1983. FM synthesis democratized music synthesis. Before FM, synthesizers were expensive, clunky, and hard to control. Creating new sounds took a great deal of time, experimentation, and know-how. But with FM, any musician could obtain a convincing instrumental sound at the touch of a button. Songwriters and composers who could not afford to hire a horn section or an orchestra could now play around with these textures and sounds.
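Chowning's trick can be written directly from its definition: the output is a sine whose phase is nudged by a second, modulating sine. In this sketch the carrier/modulator ratio, the modulation index, and the sample rate are all arbitrary choices of mine.

```python
import math

def fm_tone(carrier_hz, mod_hz, index, n_samples, rate=8000):
    """Chowning-style FM: modulate a carrier's phase with a second sine.

    index (the modulation index) controls how much energy spreads into
    sidebands, and therefore how bright or metallic the timbre sounds.
    """
    return [math.sin(2 * math.pi * carrier_hz * i / rate
                     + index * math.sin(2 * math.pi * mod_hz * i / rate))
            for i in range(n_samples)]

# A non-integer carrier:modulator ratio gives inharmonic, bell-like tones.
bell_like = fm_tone(440, 440 * 1.4, index=3.0, n_samples=800)
```

Part of what made FM cheap to build in hardware is visible here: two sine oscillators and an adder replace the large bank of generators additive synthesis needs.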

In the early 1980s, synthesizer designer Dave Smith devised a way to represent musical notes. It was called MIDI, pronounced “middy,” an acronym for Musical Instrument Digital Interface. It became the standard adopted by the music industry for controlling devices, sound cards, and synthesizers. A MIDI representation of a sound includes values for the note’s pitch, length, and volume, and can include additional characteristics such as attack and decay time. Smith’s approach conceived of music from a keyboard player’s point of view: MIDI was made of digital patterns that represented keyboard events like “key-down” and “key-up.”
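Those key-down and key-up events are literally small byte patterns on the wire. The status bytes 0x90 (note-on) and 0x80 (note-off) come from the MIDI 1.0 specification; the helper functions below are just illustrative wrappers of mine.

```python
def note_on(channel, pitch, velocity):
    """Build a raw 3-byte MIDI note-on message: status, pitch, velocity."""
    assert 0 <= channel < 16 and 0 <= pitch < 128 and 0 <= velocity < 128
    return bytes([0x90 | channel, pitch, velocity])

def note_off(channel, pitch):
    """Note-off message; release velocity fixed at 0 for simplicity."""
    assert 0 <= channel < 16 and 0 <= pitch < 128
    return bytes([0x80 | channel, pitch, 0])

# Middle C (pitch 60) struck fairly hard on channel 0:
msg = note_on(0, 60, 100)
```

Note how little is captured: which key, how hard, when pressed, when released. Everything else about the sound is left to whatever device receives the bytes.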

A piano doesn’t know what a note is; it just vibrates when struck. Like a keyboard instrument, MIDI could not describe the wavy, transient language that a singer or a saxophone player produces. It could only describe the black-and-white mosaic world of the keyboardist. But there was no reason for MIDI to be concerned with the whole of musical expression, since the aim was simply to connect synthesizers so that they could share a larger palette of sounds.

In spite of this, MIDI became the standard way to represent music in software. MIDI files contain individual instructions for playing each individual note of each individual instrument. So with MIDI it is actually possible to change just one note in a song, or to re-orchestrate an entire song with entirely different instruments. And since each instrument in a MIDI performance is separate from the rest, it’s easy to isolate individual instruments and study them for educational purposes, or to mute individual instruments in a song so that you can play that part yourself.
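Because every note is a separate instruction, these edits reduce to list manipulation. The tuple representation below is a toy stand-in for a real MIDI file, invented purely for illustration.

```python
# Hypothetical toy song: (instrument, pitch, start_beat, length_beats)
song = [
    ("piano", 60, 0.0, 1.0),
    ("piano", 64, 1.0, 1.0),
    ("bass",  36, 0.0, 2.0),
]

def retune_note(events, index, new_pitch):
    """Change just one note, leaving every other event untouched."""
    inst, _, start, length = events[index]
    return events[:index] + [(inst, new_pitch, start, length)] + events[index + 1:]

def reorchestrate(events, old_inst, new_inst):
    """Move every part played by one instrument onto another."""
    return [(new_inst if inst == old_inst else inst, p, s, l)
            for inst, p, s, l in events]

def isolate(events, inst):
    """Pull out a single instrument's part, e.g. to study or practice."""
    return [e for e in events if e[0] == inst]
```

None of these operations is possible on an audio recording, where all the instruments are mixed into one waveform; they fall out for free from MIDI's note-by-note structure.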

Computers can take your ideas and throw them back at you as mechanical suits, forcing you to stay within their inflexibility unless you resist with significant force. A number of software programs are available for composing and editing music that conforms to the standard. It has become impractical to change, expand, or dispose of all that software and hardware. MIDI has become entrenched, and despite efforts to remake it, it remains rigid. Digital tools have more impact on the results than previous tools: if you deviate from the kind of music a digital tool was designed to make, the tool becomes difficult to use. For instance, it’s far more common these days for music to have a clockwork-regular beat. This may be largely because some of the most widely used music software becomes awkward to use, and can even produce glitches, if you vary the tempo much while editing.
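The clockwork-regular beat falls out of how editors snap events to a time grid. A minimal sketch of such quantization, with a sixteenth-note grid assumed (real sequencers expose the grid size as a setting):

```python
def quantize(start_beats, grid=0.25):
    """Snap each note start time to the nearest grid line.

    grid=0.25 beats (a sixteenth note in 4/4) is an assumed default.
    """
    return [round(t / grid) * grid for t in start_beats]

played = [0.02, 0.98, 2.07, 3.24]   # a human performance, slightly loose
snapped = quantize(played)          # every onset now sits on the grid
```

The microtiming that made the performance human is discarded in one pass, and there is no inverse function to get it back.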

Consider the musical note. One of the oldest human-hewn interfaces is a flute that appears to have been made about 75,000 years ago. The flute plays approximately in tune, so it’s safe to say that whoever played it had a notion of what an in-tune note was. Before MIDI appeared, various ideas about notes were used to notate music, as well as to teach and to analyze it.

The map of reality is not reality. Even the best maps are imperfect. That’s because they are reductions of what they represent. If a map were to represent the territory with perfect fidelity, it would no longer be a reduction and thus would no longer be useful to us.

“The map appears to us more real than the land.”

— D.H. Lawrence

In other words, the description of the thing is not the thing itself. The model is not reality. The abstraction is not the abstracted. This has enormous practical consequences.

In Korzybski’s words:

A.) A map may have a structure similar or dissimilar to the structure of the territory.

B.) Two similar structures have similar ‘logical’ characteristics. Thus, if in a correct map, Dresden is given as between Paris and Warsaw, a similar relation is found in the actual territory.

C.) A map is not the actual territory.

D.) An ideal map would contain the map of the map, the map of the map of the map, etc., endlessly… We may call this characteristic self-reflexiveness.

When you learn to play an instrument, you by necessity learn to move your body in ways that are at least related to the ways of the original players.

Instruments stimulate the senses of touch and motion, across centuries and continents. Certain horns, bagpipes and drums were battle tools, almost weapons. You must gird yourself in order to focus power. The size of the muscle groups you use is connected to the rhythms you’ll play. Lost worlds come alive between your body and an ancient instrument. Other musical instruments are measured to the human body, so that you can play them with minimal motion, approaching a trance.

Jaron Lanier

The design of an instrument reflects the collective body motion of a culture, and in playing an instrument one’s body feels at least a trace of the movement gestalt of a people, something which is otherwise lost. Just as words build bridges of shared symbolic meaning, if you play world music, finding the hidden forms of motion and breathing that are intrinsic to an instrument is more important than playing with abstractions and ideas. Instruments are cultural and time travel machines. What we know of music from before recording is only what could be written down, and that doesn’t include the sense of flow and gesture, the musicality.

If you try to rediscover piano as if it were an exotic instrument or you listen to the player piano music of Conlon Nancarrow, you can experience the body music of a phantom culture, perhaps one that will exist in the future.

It’s easy to forget that the very idea of a digital expression involves a trade-off. The physical world is a fundamentally mysterious place, and acoustic instruments possess bottomless depth. Not only are the best acoustic instruments interfaces of unmatched mastery and expressiveness; they also show what’s possible. A sensitivity, and a sense of awe, at the mystery that surrounds life is at the heart of both science and art, and instruments with mandatory concepts built in can dull this sensitivity by providing an apparently non-mysterious setting for activity.

In Baudrillard’s terms, the first stage is a faithful image/copy, where we believe, and it may even be correct, that a sign is a “reflection of a profound reality.” A physical oil painting cannot convey an image created in another medium; it is impossible to make an oil painting look just like an ink drawing, for instance, or vice versa.

INSTRUMENTS AS TECHNOLOGY

Musical instruments have often been the most advanced technologies around, sometimes surpassing the tools of war. Is the Khaen a low-tech or a high-tech instrument? Is it lower-tech now than when it first appeared, thousands of years ago? As the most eloquent machines, instruments predict the future of culture.

https://www.youtube.com/watch?v=Rs9xbpZ6liE&feature=youtu.be

The second stage of the simulacrum is the perversion of reality: this is where we come to believe the sign to be an unfaithful copy, one that masks and alters the natural qualities of reality, an “evil appearance — it is of the order of maleficence.” Here, signs and images do not faithfully reveal reality to us. A digital sound, or any other kind of digital fragment, is a useful compromise. It captures a certain limited measurement of reality within a standardized system that removes the original source’s unique qualities. If you ask something of it that exceeds those measurements, it gives back a flat, mute nothing. If you didn’t specify the weight, the object isn’t just weightless; it is nothing at all.

A real music note is a mystery. The definition of a digital note is based on assumptions about which of its aspects will turn out to be important. A physical object, on the other hand, will respond to any experiment a scientist can conceive. What makes something fully real is that it is impossible to represent it to completion. No digital image is really distinct from any other; they can be morphed and mashed up.

The guzheng is one of the Chinese classical harps. Just as the Indian tradition (of raga) brings us a far more refined sense of tonality than is found in the West, so the Chinese classical tradition of harp playing brings us a deeper awareness of string articulations (types of plucks, vibratos, etc.).

If guzheng players had somehow ended up in Appalachia…

Jaron Lanier

Computers have a tendency to present us with binary choices at every level, not just at the lowest one, where the bits are switching. This is what happened when elements of indigenous cultures were preserved but de-alienated by missionaries. We know a little about what Aztec or Inca music sounded like, for instance, but the bits that were trimmed to make the music fit into the European idea of church song were the alien bits, and the alien bits are where the flavor is found. They are the portals to strange philosophies.

Something like missionary reductionism has happened to digital recording. The strangeness is being leached away by the mush-making process. Pro Tools went further, organizing tracks into multiple-choice identities while seeking to erase point of view entirely. If a church or government were doing these things, using computers to reduce individual expression, it would feel authoritarian; but when technologists are the culprits, we seem hip, fresh, and inventive.

This class of lock-in is technologically hard to overcome if the monopoly is held up by barriers to market entry that are nontrivial to circumvent, such as patents, secrecy, cryptography, or other technical hindrances.

Although MP3 is now patent-free, in 2001 it was both patented and entrenched, as Richard Stallman noted that year (in justifying a liberal license for Ogg Vorbis):

there is […] the danger that people will settle on MP3 format even though it is patented, and we won’t be *allowed* to write free encoders for the most popular format. […] Ordinarily, if someone decides not to use a copylefted program because the license doesn’t please him, that’s his loss not ours. But if he rejects the Ogg/Vorbis code because of the license, and uses MP3 instead, then the problem rebounds on us — because his continued use of MP3 may help MP3 to become and stay entrenched.

Music companies have locked customers in through switching costs with:

1. ‘Base product trap’: Luring customers with a base product and then milking profits from ‘consumables’ they are forced to buy.

2. ‘Data trap’: Customers create or purchase content and apps that are exclusively hosted on a platform, but leaving that platform for another forces them to let go of data or activity that can’t be migrated.

3. ‘Learning curve trap’: Customers are discouraged when they have to start over and learn how to use a new product.

You can’t just look at the way a technology performs at one moment in time; you have to look at the whole life cycle, including development and maintenance. You can spend thousands of dollars on software plugins, as many musicians have, only to find that few of them still work a few years later. Software becomes obsolete because it depends on perfect conformance to protocols and other aspects of a software ecosystem. Meanwhile, physical effects pedals and synthesizer modules contain computer chips that perform exactly the same functions as software plugins, yet they all still work.

The reason is that physical devices have analog, air gap connections that are resistant to obsolescence. In theory, the software plugins should be cheaper, more efficient, better in every sense. In practice, the hardware boxes are cheaper, more efficient, better in every sense, because they still work.

Individuals are locked in collectively, in part through each other. Economically, there is a cost to resisting the locally dominant choice, a barrier that takes cooperation to overcome. Converting one lossy file format into another incurs a generation loss that reduces quality. This is in effect a switching cost. Therefore, if valuable content is encoded in a format, this creates a need for continued compatibility with it.
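The generation-loss switching cost can be demonstrated with a toy "codec" that merely quantizes samples. The two step sizes standing in for two incompatible lossy formats are invented for this sketch; real codecs lose information in more elaborate ways, but the accumulation effect is the same.

```python
def encode(samples, step):
    """Toy 'lossy codec': round each sample to the nearest multiple of step."""
    return [round(s / step) * step for s in samples]

original = [0.50, -0.23, 0.87, 0.11]
gen1 = encode(original, 0.30)   # stored in format A
gen2 = encode(gen1, 0.17)       # later converted to format B

# Total deviation from the original grows with each conversion.
err1 = sum(abs(a - b) for a, b in zip(original, gen1))
err2 = sum(abs(a - b) for a, b in zip(original, gen2))
```

Because err2 exceeds err1, every migration away from the entrenched format costs quality, which is exactly why valuable archives keep demanding compatibility with the old format.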

FILLING A NICHE

One aspect of information technology is that a particular design will fill a niche and, once implemented, becomes a permanent fixture from then on, even though a better design might just as well have taken its place. An annoyance then explodes into a challenge, because computers are growing at an exponential rate, amplifying the consequences of initially inconsequential decisions into unchangeable rules.

MIDI is the lattice on which almost all the popular music you hear is built. Much of the sound around us — the ambient music and audio beeps, the ring-tones and alarms — is conceived in MIDI. The whole of the human auditory experience has become filled with discrete notes that fit in a grid.

Before MIDI, a musical note was a way for a musician to think, or a way to teach and document music. It was a mental tool distinguishable from the music itself. After MIDI, a musical note was no longer just an idea; it was a rigid structure, thought solidified.

Science removes ideas from play empirically, for good reason. Lock-in, however, removes design options based on what is easiest to program, what is fashionable, or what is created by chance. Lock-in removes ideas that do not fit into the winning digital representation scheme, but it also reduces or narrows the ideas it immortalizes.

COMMAND LINE INTERFACE

A lot of the locked-in ideas about how software is put together come from an old operating system called UNIX. While MIDI squeezes musical expression through a limiting model of the actions of keys on a musical keyboard, a UNIX program is like a simulation of a person typing quickly on a typewriter-like keyboard. The operating system provides a set of simple tools that each perform a limited, well-defined function.

For example, a principle of UNIX is that a program can’t tell whether a person hit return or a program did so. UNIX believes in discrete abstract symbols, not in temporal, continuous, non-abstract reality. Timing is suppressed by this particular idea; UNIX is more like a typewriter than a jazz musician. As a result, it is based on events that don’t have to happen at a precise moment in time.

The human organism, by contrast, is based on sensory, cognitive, and motor processes that have to be synchronized precisely in time. UNIX tends to “want” to connect to reality as if reality were a network of fast typists. Rather than developing their own operating systems, a lot of companies licensed the UNIX source code and produced their own derivatives to run on their hardware. If there were one sentence that would summarize the UNIX community, it would be either “Why do you want to do that?” or “Did you read the manual?”
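The "can't tell who hit return" principle is easy to demonstrate: a child process reading standard input behaves identically whether the bytes come from a keyboard or from another program. A small sketch using Python's subprocess module, with a trivial echo child invented for the purpose:

```python
import subprocess
import sys

# The child reads one line from stdin and echoes it, exactly as it would
# if a person had typed the line and pressed return. Here another program
# (this script) supplies the "keystrokes" through a pipe instead.
child = subprocess.run(
    [sys.executable, "-c", "print(input())"],
    input="hello\n",
    capture_output=True,
    text=True,
)
```

The child's code contains nothing distinguishing the two cases; from its point of view, a return key and a newline byte in a pipe are the same event, stripped of its timing.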

Everything Is a File (Unless It Isn’t)

Not everyone thought the idea of the file was so great. The first design for something like the World Wide Web, Ted Nelson’s Xanadu, conceived of one giant, global file, for instance. The first iteration of the Macintosh, which never shipped, didn’t have files. Instead, the whole of a user’s productivity accumulated in one big structure, sort of like a singular personal web page.

The ideas expressed by the file include the notion that human expression comes in severable chunks that can be organized as leaves on an abstract tree — and that the chunks have versions and need to be matched to compatible applications. The idea of the file has become so big that it’s worth trying to notice when philosophies are congealing into locked-in software.

Corresponding philosophies of how humans can express meaning have been so ingrained into the interlocked software designs of the internet that we might never be able to fully get rid of them.

Every element in the system — every computer, every person, every bit — comes to depend on protocols that adhere to a common standard. Commercial interests promoted the widespread adoption of standardized designs like the blog, and these designs encouraged pseudonymity instead of the proud extroversion that characterized the first wave of web culture. Instead of people being treated as the sources of their own creativity, commercial aggregation and abstraction sites presented anonymized fragments of creativity as products that might have fallen from the sky or been dug up from the ground, obscuring the true sources.

Jaron Lanier

This leads to a nostalgic malaise of trivial mashups of the culture, and of fandom responding to the dwindling outposts of centralized mass media. It is a culture of reaction without action. The deep meaning of music is being reduced by illusions of bits. If you pretend to be certain that there’s no mystery in something like consciousness, the mystery that is there can pop out elsewhere in an inconvenient way. There’s a way that consciousness and time are bound together: if you try to remove any potential hint of mysteriousness from consciousness, you end up mystifying time in an absurd way.

Humans are free. But maybe if people pretend they are not conscious or do not have free will, we might collectively achieve a kind of antimagic: engineering our genes to better support the click track, making culture a second-rate activity, and spending centuries remixing the detritus of the 1960s and other eras from before individual creativity went out of fashion.

MUSIC AS POST-SYMBOLIC COMMUNICATION

Music is a “consciousness-noticing machine.” It holds the promise of a new and fundamentally different mode of post-symbolic communication: I just Am, I think. In other words, music with the ability to time-bind, passing information from generation to generation, as humans can.

By reorienting the perceptual processes of our brains through music, we could experience a different relationship to our contexts… Lanier calls this state “fluid concreteness.” Language is what people do with the tiny part of physical reality that we have the power to manipulate at the speed of thought. Language is what they call a “hack” in Silicon Valley culture.

A symbol is a trick for the sake of efficiency. It lets the brain express thoughts to others about as fast as they are experienced, without all the work of realizing changes to physical reality. Symbolism turns the part of the universe we can control, like the tongue, into an invoker of the rest of the universe, and all possible universes, that we cannot control in haste.

Jaron Lanier

What is communication without the symbol — without the ambiguity of interpretation? If words were too specific, they would be useless. One concept is necessarily used to refer to another.

Wouldn’t they be abstract or Platonic references, the shapes of things to come, or access to the “Umwelt”? Are they really that different from words? Even so, it is probable that post-symbolic communication will be distinct from whatever came before. Consider “blue.” A scientist might describe it as the frequency of light that best matches a class of sensors in the retina. It is said that people in ancient times didn’t even notice the existence of the color blue until there was a word for it; it is absent from much of ancient literature. How can we not wonder what we might be missing today? Maybe post-symbolic communication will open our perception wider than words.

The tongue and the fingers! Fingers can play notes on a piano about as quickly as a pianist can think of them. Listen to an Art Tatum solo: the fastest piano players are spinning inventive improvisations as fast as people can hear. We have used hands to create every artificial thing we know.

Information theory provides the conceptual framework for our modern communication technologies. The central set of conventions — the design protocol — used to manage a flow of information in a computer program is still based on the metaphor of the telegraph. Systems created by humans on this principle become unwieldy when tasked with a very large set of information that changes in real time; a small glitch can cause terminal program failure.

Biological systems, by comparison, are both exponentially more complex and more resilient. Neural processing is not yet well understood, but it is evidently a multichannel operation, and the system itself exhibits a high degree of plasticity. The paths information follows as it travels through the brain, and the brain’s capacity to adapt new paths in relation to its changing environment, have no place in a mechanistic model.

Instead of thinking of information as a single bit traveling from A to B, Lanier conceives of an ever-changing surface from which multiple points are sampled simultaneously and continuously. This will be a new trick in the repertory of the species. The same parts of your body that were used to make language possible will be leveraged to make the stuff of experience, not symbolic references to hypothetical experiences. True, it will take years to learn how to play things into existence, just as it takes years to learn to speak a language or play the piano. But the payoff will be tangible.

These experiences point the way to a better ramp of progress, a survivable ramp. Lanier calls this better ramp the McLuhan ramp. Consider that people have been innovating ways of connecting with each other since the dawn of the species: from spoken language tens of thousands of years ago, to written language thousands of years ago, to printed language hundreds of years ago, to photography, recording, cinema, computing, networking; then to virtual reality, and eventually to what I hoped this post might provide a glimpse of, postsymbolic communication — and then on to what I cannot imagine.

A McLuhan ramp is made of inventions, but the inventions don’t just achieve practical tasks; they foster new dimensions of personhood — potentially even empathy. The philosophical notion of post-symbolic communication and the engineering project of phenotropic architectures are attempts to take tiny steps up the McLuhan ramp. In addition to exploring distant star systems, we might also imagine that in the future we’ll find ways to know each other better. Since we’re fundamentally creative, that process would never end.

The phrase “Music is sex with God” was never intended as a provocation.
