01 April 2009

So You Want To Learn Python?

First, let me congratulate you on wanting to give Python a try! I have used many languages but always return to Python as it is the easiest to read, the fastest to programme, works on any platform and is bug-free. It's very much like an easier Java, in the same way that Java is an easier C++. There are many tutorials and books available to ease you into the language. Read on and I'll recommend my favourites.

Learning Python depends on what other programming experience you have. If you are already comfortable with coding, start with Instant Python, a one-page crash course. Bit-heads can manage with that and the official docs.

For those of us without a computer science degree, I recommend the thorough tutorial Think Python. This is a free book you can also buy on paper as Python for Software Design.

Of the made-from-trees books that are generally available, I have had great success teaching using Learning Python (Third Edition).

Once you get more into the language, the Python Cookbook (Second Edition) is an invaluable source of problems and solutions. I would say this even if I wasn't one of the contributors! The Python recipes at ActiveState are another good source of solutions, but rather more mixed in quality.

P.S. How many languages have I used? I can recall Basic (loaded from cassette tape!), Fortran (punch cards!!), Perl, Visual Basic, Pascal, Java, C, C++ and 8086 assembler. Or rather, I cannot recall them, which helps me sleep sounder at night.

23 October 2008

Csound Revisited

From my silence here you might imagine I've been doing no programming at all. Nothing could be further from the truth. While it is true that my commercial and open source projects are pretty well at a standstill, I have been working with a variety of languages for music composition and processing, so you can soon expect some sort of follow-up to the articles I wrote on music control languages, Chuck, etc.

How soon? Well, maybe I shouldn't have said anything at all! I am deep in a Masters in Music Technology at the University of Limerick, the one run by the Roman Numeral Department (as only I call it). Properly it's the CCMCM, The Centre for Computational Musicology and Computer Music. Here I am being forced to confront the wonders of Java, Max/MSP and Csound.

It is the last of these that has taken me by surprise. I had previously pretty well dismissed this ancient language. Certainly it is a product of some bygone age of punch cards and batch processing. But it is not quite the relic I had assumed. Once I realised that I could a) whip up a quick interface in FLTK, b) call Python routines, c) call Csound from Python and d) create realtime music with interactive control (instead of mere algorithmic compositions) I became a lot more interested.

The first fruits of this dalliance will be premiered in Mainz, Germany this Saturday. Combining audio file manipulation, realtime MIDI control and a pretty OK interface, "The Absence Of Baudrillard" allows me to interact with pseudo-random events in the same improvisational manner I've been exploring for the last three years (using Reaktor). But the sounds are quite different.

Read about the thoughts behind this piece at The Theatre Of Noise.

27 September 2007

Typefaces On The Web

I am sure we all have our favourite typefaces and others that make us cringe in horror. Me, I like Futura; it's useful in (nearly) all its variants and stays out of the way when you want it to. I'm also rather mad for Fette Engschrift (aka DIN 1451).

In fact I'm the type of idiot who buys typefaces, though I don't think I have bought more than one set in the last few years. Problem is, eventually I have to design a web page, and then all my love for typefaces goes to waste.

That's because only fonts installed on the client computer will be viewable in their browser. Some time ago I decided to look into this sad state of affairs. Just now I tidied up the document and posted it for your edification. Browser Font Samples is over on one of my other websites since I didn't want to adapt the formatting for this blog.

What we need is a technology like that provided by PDF documents, namely, embedded fonts. I remember the days before PDFs, when fonts had to be specified by number and sending a document from one platform to another was a nightmare. If you don't know what I'm talking about you missed a great deal of hell. If you do know what I mean I need only mention one sordid word to make you shiver with horror: Palatino.

But anyway, wouldn't it be nice to embed fonts on web pages? Well, you can, in a manner of speaking.

The simplest way is to render the letter shapes you want as images and insert those on the web page. Though a time-tested technique, this has the disadvantage of requiring extra work, plus it constrains the design process -- what to do if the letters are being generated from a dynamic publishing system? If you're not careful with alt attributes you might also reduce the accessibility of the page or its friendliness to search engines. Over-use of images will also increase the page load time, again, unless you are careful with image optimisation techniques.

And of course every time you decide to change the point size, colour or other attribute it's back to your graphics app.

Sometime in late 2003 a bunch of clever folk came up with sIFR, which just happens to stand for the Scalable Inman Flash Replacement (clever they were but not very good with branding). The technology is based on the fact that most modern browsers are compatible with Javascript and Flash. Simply, a script scans your page for text you've asked it to render. For each area of text it creates a Flash movie of the same size, containing the self-same text but in the font you've asked for.

This can look very nice, as the demo illustrates. The rendering is fast, overhead is small and it degrades properly. Clients without Javascript turned on or without Flash installed will see the original text.

But it does require that you use Flash and post SWF files to your website. Over-used it can appreciably slow down the browsing experience. And it does sacrifice some usability... don't put links in sIFR blocks please! If all this still sounds good to you and you don't need backwards compatibility with older browsers you can always try the version 3 beta. Just be sure you have $700 on hand for a copy of Flash CS3 Professional.

Prefer a cheaper solution? If you have server-side access you can use Python etc. to render text using whatever imaging library you can get your hands on. That's the sort of solution I'll be looking at for my own requirements.

Anything to save me from a life of Georgia and Tahoma.

04 September 2007

ChucK Language Features And Limitations

In this second look at ChucK I focus on the language features that make this system unique and outrageously cool for realtime music production. I also run through the major limitations.

Working with the ChucK language might take some adjustments, depending on what you are used to. It has control structures, strict typing, associative and multi-dimensional arrays, most of which will look familiar enough to any programmer. But a couple of things are strange. To start off with, operators are left-associative. Second, a lot is done with the ChucK operator, including connecting audio components (called Unit Generators or ugens). In this case the operator acts like a patch cable. For example:

Impulse imp => JCRev rev => dac;

This connects an impulse generator to a reverberation unit and then out to the DAC, which abstracts a sound card output. Neat, huh?

The same operator is used for simple assignment. For example, to change some properties:

.95 => rev.gain;
.2 => rev.mix;


This gives us a 20% reverb mix and reduces the gain to 95%.

This operator can also be used to call functions. One might think to use an assignment to capture the result of a function, and this in fact works:

addthese(3, 6) => int result;

But so does the more peculiar:

(3, 6) => addthese => int result;

(And oh yes, semi-colons go at the end of every line and braces must be used to match up indent levels. I hate this after living in the simpler, cleaner Python world. Oh well.)

The coolest thing about ChucK is how it manages concurrency. Each patch (or function) may be run in its own process (called a "shred"). The process of starting one of these is called "sporking". (Did these guys watch too much Mork and Mindy as kids?) Each process refers to local time to make events happen. Then -- and this is the miracle -- all running processes are automatically synchronised at the sample level!

The canonical example is the command chuck foo.ck bar.ck which runs the two shreds "foo" and "bar" concurrently. Need to record the output from these? Simply add in the shred "rec" like so: chuck foo.ck bar.ck rec.ck. Remarkably, the file "rec" is about 10 lines of code that does not refer to the other shreds at all.

The other major innovation is that ChucK is a strongly-timed language. Two native types, "time" and "dur" (duration), are available in convenient units and may be operated upon in a natural fashion. So for example, n should be 50 after this statement:

1::second / 20::ms => float n;
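
For comparison, the same arithmetic can be sketched in Python with the standard library's timedelta type. This is only an analogy, not how ChucK works internally; the names second and ms are my own stand-ins for ChucK's units.

```python
from datetime import timedelta

# Stand-ins for ChucK's duration units (names chosen for illustration only).
second = timedelta(seconds=1)
ms = timedelta(milliseconds=1)

# Dividing one duration by another yields a plain number, as in ChucK.
n = (1 * second) / (20 * ms)
print(n)  # 50.0
```

Python treats durations as a library type, whereas ChucK builds them into the language itself.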

The real magic is due to the explicit handling of the current time using the "now" keyword. The following snippet advances time by one second. The current shred is suspended, audio is generated, and any other shreds keep computing.

1::second => now;
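
To get a feel for how several shreds can share one clock, here is a toy sketch in Python using generators. It is not ChucK's actual implementation: the names run and every are hypothetical, time is counted in whole samples, and no audio is produced. Each generator yields the number of samples it wants to wait, which plays the role of chucking a duration to now.

```python
import heapq

def run(shreds, until):
    """Run generator-based 'shreds' against one shared sample clock.

    shreds is a list of (name, generator) pairs. Each generator yields the
    number of samples to wait before it is woken again. Returns a log of
    (time_in_samples, name) wake-ups up to and including `until`.
    """
    log = []
    queue = []
    for order, (name, shred) in enumerate(shreds):
        # `order` breaks ties so generators themselves are never compared.
        heapq.heappush(queue, (0, order, name, shred))
    while queue:
        now, order, name, shred = heapq.heappop(queue)
        if now > until:
            break
        log.append((now, name))
        try:
            wait = next(shred)  # the shred runs until it advances time
        except StopIteration:
            continue  # the shred has finished
        heapq.heappush(queue, (now + wait, order, name, shred))
    return log

def every(samples):
    """A trivial shred that wakes up every `samples` samples, forever."""
    while True:
        yield samples

events = run([("foo", every(2)), ("bar", every(3))], until=6)
print(events)
```

The two shreds never refer to each other, yet their wake-ups come out in strict time order because both are scheduled against the same clock.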

ChucK supports events; keyboard, mouse and joystick support; plus MIDI and OSC communications. It comes with some useful built-in libraries and ugens such as Impulse, Step, Noise, filters (one-pole, two-pole, low-pass, high-pass, band-pass, band-reject, resonance and custom), oscillators (sine, ramp, pulse, square, triangle, saw), panner, mixdown, envelopes, effects (delay, three implementations of rev, chorus, modulation) and loads of instrument simulations (everything from sitar to percussion). It can read and write audio files, and a sparsely documented new LiSa feature supports live sample manipulation for granular synthesis and looping effects.

This leaves very little outside ChucK's scope, though there are definite limitations in the current implementation.

1. There is no serial I/O functionality, meaning that ChucK cannot be used to talk to boards such as the Arduino.

2. There is no file I/O for reading/writing settings and other data. This also means that custom log files and the like are not possible.

3. Namespace support is limited, so it is not easy to organise code into libraries for reuse. Each file can have only one public class. Files do not have explicit namespaces, though one is created if a file is sporked as a shred. But this is not addressable, it merely keeps private data in each process from colliding.

4. Multiple sound card outputs (i.e. more than 2) are supposed to be addressable, but at first I could not get this to work. ChucK needs a command-line argument to enable this, and my problem 'twas but a small syntax error in this argument.

5. Sound file writing is limited to mono files, so in order to get stereo two files have to be written and later combined outside ChucK. See this example.

6. The language contains (almost) no string manipulation functions. In fact strings are barely mentioned in the manual.

7. Garbage collection is not implemented. This means that sporking files eats memory. The workaround is to re-use existing processes, but this is obviously more work than would be nice.

8. The built-in ugen class cannot be subclassed, so there is no way to inherit behaviour for one's own generators. It is possible to rewrite some of this in a class of one's own devising, but that's redundant. I posted an example of this here.

9. The architecture is not as powerful as the client-server paradigm used in SuperCollider.
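
As for limitation 5, combining two mono files outside ChucK need not be painful. Below is a minimal sketch using Python's standard wave module. The function name merge_mono_to_stereo is my own, and it assumes both inputs are uncompressed mono WAV files with the same sample width and rate.

```python
import wave

def merge_mono_to_stereo(left_path, right_path, out_path):
    """Interleave two mono WAV files into one stereo WAV file.

    Returns the number of frames written. If the inputs differ in length,
    the longer one is truncated to match the shorter.
    """
    with wave.open(left_path, "rb") as left, wave.open(right_path, "rb") as right:
        if left.getnchannels() != 1 or right.getnchannels() != 1:
            raise ValueError("both inputs must be mono")
        if (left.getsampwidth() != right.getsampwidth()
                or left.getframerate() != right.getframerate()):
            raise ValueError("sample width and rate must match")
        width = left.getsampwidth()
        rate = left.getframerate()
        frames = min(left.getnframes(), right.getnframes())
        left_data = left.readframes(frames)
        right_data = right.readframes(frames)

    # Stereo frames alternate left and right samples.
    stereo = bytearray(frames * width * 2)
    for i in range(frames):
        src = i * width
        dst = i * width * 2
        stereo[dst:dst + width] = left_data[src:src + width]
        stereo[dst + width:dst + 2 * width] = right_data[src:src + width]

    with wave.open(out_path, "wb") as out:
        out.setnchannels(2)
        out.setsampwidth(width)
        out.setframerate(rate)
        out.writeframes(bytes(stereo))
    return frames
```

Calling merge_mono_to_stereo("left.wav", "right.wav", "stereo.wav") would write the combined file; those file names are placeholders.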

I look forward to future versions which remove these limitations, starting with number 1. In the meantime ChucK is fun to fool around with.

I thank the Arts Council for their support in this research.

02 September 2007

Getting Started With ChucK

In my last music technology article I concluded that there were no suitable Python tools for real-time composition and audio processing. In a previous look at specialised music control languages I discovered that some of these offer novel paradigms that might be particularly useful for realtime composition.

As a result, I decided to investigate ChucK. Back in spring 2007, when I initially did this research (sorry to be so tardy writing about it all!) I decided it was yet too immature to warrant deep study. But improvements have been steady, and as of 30 August things got dramatically better with the ASIO patch to version 1.2.1.0.

After downloading this tweaked ChucK executable I tried it with a simple example patch file from the electro-music forum. Wham! Sound! It worked, first time. Amazing OOBE*. Then I tried tweaking it and all sorts of things happened, most of them nice.

I read through some of the list archives and started wondering where all the example files were. Going to the download page I saw that there were two Windows options: "standard command line executable" or miniAudicle, the "experimental integrated editor/VM". So I grabbed these to see what was in them. Both include the docs and examples, not only the components mentioned. Not so clear from the labeling, but maybe obvious in hindsight.

Trying the horrifically named miniAudicle I had a few issues, none of them major. First, it took an awful long time to load. I had thought maybe it was completely broken. Some sort of "loading..." message would reassure.

Next, it detected my RME Fireface drivers but once I changed to these using the preferences dialogue I needed to restart the app. Reasonable perhaps, but annoying considering the aforementioned load time.

Playing back one of the examples I had used before, everything sounded wrong. I determined this was because the sampling rate was very high (192 kHz). I am still not sure why this would have screwed up the sounds, as they were Shakers. Perhaps these are based on samples?

Anyway, setting the rate to 32 kHz made things better, but I wonder why these are the only two choices when most audio apps permit standard 44.1 kHz, 48 kHz etc. Maybe something to do with choosing only the base rates from the sound card.

Editing was nice in the coloured syntax window, especially once I'd set tabs and the font to my liking. I am glad there's a Recent Files list. However, implementing file drag'n'drop from Explorer to the IDE would be a big usability bonus.

The extended keyboard keys do not seem to be supported. This is particularly annoying for my use, since I have a mini keyboard that makes it difficult to use the default Del key (for example).

I started thinking about using miniAudicle in performance. Live coding is one of ChucK's strengths.

The interface lets you start up a virtual ChucK machine and then add files to it individually as Shreds. This is very handy but some sort of global controls for mixing are necessary. Otherwise, the only way to stop a shred (take it out of the mix) is to kill it, which happens abruptly and not always too nicely (in sonic terms). Perhaps the implicit mixer before the DAC could be exposed, along with level and mutes on each channel/shred.

A solution to this might very well be forthcoming on the ChucK mailing list, which has some active and helpful participants.

A further usability issue was revealed when I tried to use a patch from the examples folder that required sound files. These were referred to by relative pathing, but this only works if the script is run in its own folder. It appears there is no way to programmatically specify the root folder. So how does one integrate patches like this into a miniAudicle session? Apparently this is a known issue. And a pretty big one that needs an immediate fix, IMO.

A simple solution would be if a list of paths could be specified in the preferences. The IDE could search these first looking for sound files.
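
The search-path idea is simple enough to sketch in a few lines of Python. This is purely illustrative and not a miniAudicle feature: find_sound_file is a hypothetical helper, and the list of folders is assumed to come from some preferences setting.

```python
from pathlib import Path

def find_sound_file(name, search_paths):
    """Return the first existing file matching `name`, or None.

    Absolute names are checked as they are. Relative names are tried
    against each folder in `search_paths`, in order.
    """
    candidate = Path(name)
    if candidate.is_absolute():
        return candidate if candidate.is_file() else None
    for folder in search_paths:
        full = Path(folder) / candidate
        if full.is_file():
            return full
    return None
```

A patch asking for "data/kick.wav" would then resolve against whichever configured folder holds it first, regardless of where the script was launched from.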

Other deficiencies are evident from the documentation, which is straightforward, honest and comical. Garbage collection is not implemented, so long-running pieces with lots of spawned processes might run out of memory. Some of the sound generators are not yet documented.

But overall I think ChucK is mature enough for anyone interested in experimenting with concurrent audio programming to give it a whirl.

In a follow-on article I'll look at the programming constructs and highlight what makes ChucK so unique.

*Out Of Box Experience

I thank the Arts Council for their support in this research.

31 August 2007

Music Control Tools: Python-Based

In the previous part of this series of articles I looked at dedicated programming languages for music creation. But why invent a new language for this one special domain? Surely it makes more sense to use one of the many existing languages, provide libraries for required protocols and interface with the correct hardware?

Well, that is not necessarily true. As we saw in our short look at ChucK, new paradigms of coding ("strongly-timed") may enhance development in a specific domain.

However, the advantages of adopting an existing language are obvious. The developer need not work out their own grammar, syntax and tools, but can rather concentrate on the actual audio part of the problem.

In this article I will look at audio development tools based in my language of choice, Python. These will all have the advantages of that language: clear syntax, pragmatic mix of functional, object-oriented and imperative models, strong OOP implementation, etc.

PyGame, based on the Simple DirectMedia Layer (SDL), has lots of graphics functionality, but also plays back MPG files and accesses devices like CD-ROM drives and joysticks. It can queue and play music streams, and contains sound sample manipulation, but is not robust enough in this regard for our purposes. For instance, it can handle only one audio stream at a time and has no DSP functionality.

pySonic is a wrapper for the FMOD library and provides sound file playback and recording, MIDI, 3D sound and support for many formats (wav, aiff, mp3, ogg, mod, etc.). FMOD is not free or open source, so one would have to abide by their license restrictions.

PyPortMidi provides MIDI I/O, although the latest binary is only for Python 2.4. Likewise here's a module for OpenSoundControl (OSC) client functionality.
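
To show how little an OSC client actually has to send, here is a rough sketch that builds a message by hand using only the standard library. It is not taken from the module mentioned above. It follows the OSC 1.0 encoding for a message with a single float argument; the function names and the "/freq" address are made up for illustration.

```python
import struct

def osc_string(text):
    """Encode an OSC string: ASCII, null-terminated, padded to 4 bytes."""
    data = text.encode("ascii") + b"\x00"
    return data + b"\x00" * (-len(data) % 4)

def osc_message(address, value):
    """Build an OSC message carrying one 32-bit big-endian float."""
    return osc_string(address) + osc_string(",f") + struct.pack(">f", float(value))

packet = osc_message("/freq", 440.0)
print(len(packet))  # 16
```

The resulting bytes could then be handed to a UDP socket's sendto method, aimed at whatever host and port the receiving synth listens on.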

PySndObj is a wrapper for the SndObj Sound Object Library which provides realtime audio IO and MIDI input (though not on all platforms). Currently this is too immature to recommend for the task at hand.

PyMedia is a library for sound file playback and recording which supports wav, mp3, ogg, avi, divx, dvd, cdda etc. It also has some DSP functionality, namely resampling and frequency analysis.

athenaCL is an interactive command line program specifically designed for algorithmic composition and pitch model studies. It has a host of tools in this regard, outputting results to Csound, MIDI, audio file, XML and text formats. While not requiring Csound, it is tightly integrated with that programme, containing instruments in Csound format. (Without it, results can nonetheless be rendered to MIDI files.) This programme looks excellent within the domain it is targeting.

Finally, MusicKit includes serial and MIDI I/O, scheduling and synchronization, real-time (or non-realtime) synthesis (FM, wavetable, physical modeling etc.) and DSP. It supports quadraphonic sound, MP3 and Ogg/Vorbis plus multiple inputs and outputs. Furthermore, the package comes with high-end tools including a sampler, sequencer and score player.

Of all the packages here, it potentially provides the most robust toolkit for sound creation and manipulation. Unfortunately, not only is the GUI on Windows incomplete, but so are MIDI support and the DSP functions. Critically, MIDI and DSP are also missing from the Linux version. (MusicKit had its genesis on the NeXT and so works just fine in OpenStep... in case that helps you.) There has been no version since May 2005, so it is fair to say this tool is ripe for salvage.

The conclusion of this article must therefore be that there is no Python-based system that meets our needs at present. And while it would be possible to scrape one together from bits and pieces of the packages listed here, that would be far more work than could be justified.

Reference: The wiki entry PythonInMusic contains tools I have not covered here, such as those for media playback and cataloging.

I thank the Arts Council for their support in this research.

30 August 2007

Music Control Languages: Specialised Tools

In the first article in this series, I set the criteria for my examination of audio control languages. In this second part I examine the specialised tools available for the purpose. The series will then be concluded with a look at Python-specific solutions.

Though the last few items are subjective and optional, all products discussed meet the first four criteria unless otherwise mentioned:
* open source
* free of charge
* cross-platform (Mac, Windows, Linux)
* programmatic control with OOP paradigm
* mature implementation
* efficient resource handling
* powerful, expressive, clear syntax
* strong user community

A good amount of software has emanated from IRCAM in France, possibly the foremost institution of electro-acoustic music in the world. Everyone from Pierre Boulez to Aphex Twin has been shaped by IRCAM. So it makes sense to start our examination of software with their graphical audio software Max/MSP. This is popular, but not free. jMax can also be quickly ruled out, since it is a non-OOP visual programming environment for building interactive real-time music applications. Though it is free and open source, the Windows version is only a beta and might not be robust enough for critical use.

OpenMusic is another IRCAM package, also graphical, running only on Macintosh (Linux port is on the way). I include it here for completeness only.

Pure Data (PD) gets the thumbs up from its users. It too derives from IRCAM work on Max. PD may be used for audio and MIDI processing, but also graphics and video (it has a fast dedicated openGL library, GEM, for this purpose). The forum is active and there are many resources on the web, though the home page does link to much dead content.

However, PD is a "patcher" programming language, in which one graphically links together blocks that perform different functions. It is not object-oriented but is important enough to consider, especially as regards the large number of available extensions. For example, RTC-lib, the Real Time Composition Library, provides high-level compositional algorithms. There are also mature interfaces to devices such as the Arduino.

So how do PD and Max/MSP differ? The latter has better documentation, more bundled extensions and a smoother usability experience. Plus it has some programmatic niceties, at least according to those who know it well. For example, the order of processing is determined by position in the Max flowchart (right to left), whereas in PD it depends on when you created the items. That is, items you added to the flowchart first when building will execute first when running! That makes zero sense.

Moving across the water we next look at Processing, a programming language which sprang from the MIT Media Lab. Designed specifically for interactive applications, it is written on top of Java, and has the distinct advantage of being able to run directly in a web browser. There is a good amount of support on sites like Processing Hacks. A MIDI library is available, as is the Sonia Library for advanced capabilities like multiple sample playback, realtime sound synthesis and analysis.

However Processing is still in beta, and this shows in its inefficient implementation. According to this thread it's not quite ready for prime time in its performance and memory handling. But if your emphasis is on visually interactive applications (perhaps for your mobile phone) you should give it a look.

SuperCollider is an interpreted object-oriented language that has two components: a client which communicates with a realtime sound synthesis server via OSC. This architecture is flexible enough to permit "live coding", changing code in real-time as a performance tool. Code objects and process declarations can also be shared and modified over a network.

Because SuperCollider has support for MIDI and serial port I/O (Linux only), it can be used to control, or be controlled by, pretty well any hardware device. So you can scratch with your Wii remote, should that be your thing.

The Windows version, known as Psycollider, is in beta. The fact that the last version update was over a year ago is not encouraging.

ChucK is a "strongly-timed" audio language optimised for real-time synthesis, composition, performance and analysis. It supports MIDI, OSC and multi-channel audio through a command-line or Integrated Development Environment (IDE). Live coding is supported.

Several of the programming paradigms in ChucK are novel. Unlike other audio systems that have a separate audio signal and control signal, ChucK has a unified timing mechanism. One of the built-in types is the "shred", a non-preemptive process with its own namespace. (However, other than this and the public space, there is no other control for namespaces.) Shreds can be sporked (I kid you not) which essentially means instantiated in the virtual machine. Shreds are automatically synchronized and may be scheduled (though ChucK calls it "shreduled" -- silly rabbits!).

Garbage collection is by reference counting, but this is still being implemented. In fact in many places in the documentation there's the phrase "not completely implemented" or even "will leak memory". The last version was 23 August 2007. With further active development I hope ChucK will be ready for prime time soon.

Csound is the grand-daddy of computer music languages, its lineage reaching back well before the nineties, through Music 360, to Max Mathews's MUSIC of 1957. There are vast resources available, but Csound is closer to a lifestyle than a programme, requiring a dense incremental development cycle. The language itself is limited, a fact acknowledged by the Csound community, which has responded by embedding a Python interpreter. This creates a somewhat complicated schizoid syntax for coding.

Python can also be used on the "outside" of Csound, to manipulate Csound files, speeding the creation of sounds. The article Sound synthesis with Csound contains the required module.

A number of front-ends have been developed to ease development. For example, Cabel provides a GUI with Python script control.

While I am sure that a given collection of tools might make work in Csound highly efficient, determining what that might be is the work of months. Unless one's full-time job is computational music, it's difficult to recommend Csound. (I can see the flames igniting from here.)

So, conclusions?

* Pure Data is a rough-hewn many-faceted system that is the best of the free graphical solutions. If you want to spend money get Max/MSP instead.

* SuperCollider has a solid open architecture that suits networked and multi-user solutions, so long as one is willing to forget Windows.

* ChucK is still a work in progress but its "strongly-timed" declarative syntax is an innovation. As is some of the nomenclature! Watch this one.

* I will use Processing to hack my phone into a musical instrument, but not use it for mission-critical work as of yet.

* Csound is what you should use once you get employed by a university to make sounds.

Addendum: Sometime after I first wrote this document I came across the Wikipedia "Comparison of audio synthesis environments" here. An exhaustive list of software for algorithmic composition is located at algorithm.net. This includes many products not on my list but none that meet my criteria.

I thank the Arts Council for their support in this research.