[linux-audio-dev] Developing a music editor/sequencer

NadaSpam NadaSpam at adelphia.net
Sun Jan 30 02:24:58 UTC 2005


Sorry if multiple copies of this appear. The spam filter doesn't like my 
choice of titles. I've tried a few variations so far.

I'm looking to develop a music editor/sequencer somewhat in the vein of 
Cakewalk/Rosegarden, but looking more towards the future of MIDI and audio 
capabilities. I've been thinking about this for a long time, and I think I 
have enough of a plan now to make a go at it.

This is a rather long post, so I've divided it into four parts: Why Not 
Rosegarden?, Project Overview, Design Goals, and End Notes.


Why Not Rosegarden?
Why not just join the Rosegarden development team? While Rosegarden has a lot 
of promise, its design goal is different from what I'm thinking of. To put 
it crudely, Rosegarden's goal is to be a better Cakewalk. This isn't meant to 
be disrespectful to the Rosegarden developers. I like Cakewalk. (If you 
don't, substitute Cubase, or whatever professional tool of that genre.)

Music is a language. While music itself is much more complex and vague (like 
natural language) than a programming language such as C, I will use the 
analogy of a programming editor. Rosegarden (and all similar tools) is like a 
C editor. When it's mature, it will be a very good C editor. In addition, the 
modularity and generalized approach that Rosegarden uses will make it 
possible to also be a good C++ editor. I'm thinking of something more like a 
multi-editor that will work for Perl or Lisp. It will have its own 
limitations, so I'm not trying to create the be-all end-all of music editors. 
But there are some fundamental things that Rosegarden will never be able to 
accomplish. (And most users won't have any reason to want an editor that will 
accomplish these things.)

The first involves the way meter and tempo are handled. There are several 
pieces of 20th century music that incorporate 2 or more voices playing 
simultaneously in 2 or more meters. An example that chokes every editor I 
know of (though I'm told LilyPond can do it with difficulty, but it has no 
sequencing capability and isn't WYSIWYG) is Bartok's "Music for Strings, 
Percussion, and Celesta". I'm not saying that you can't create a Rosegarden 
file that will play this piece. What I am saying is that the notation will be 
extremely ugly. The issue is that not all of the voices are playing 
simultaneously in the same meter. To notate this all in one meter means 
that some of the voices will look extremely complicated with myriads of ties 
and accent marks.

Another interesting thing is to have 2 voices playing at different tempos. 
No examples come to mind, but I can think of some cool effects that could be 
achieved by having one voice race ahead of another. Again, Rosegarden could 
play such a piece, but the notation would be ugly.
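
To make the two cases above a bit more concrete, here is a rough Python 
sketch in which each voice carries its own meter and tempo. The name 
barline_times and the idea that the tempo counts the same beat unit the meter 
does are assumptions made only for illustration; nothing here comes from 
Rosegarden. It works out when each voice's barlines fall in seconds, which is 
what a per-voice ruler would need in order to draw each part in its own meter.

```python
from fractions import Fraction

def barline_times(beats_per_bar, tempo_bpm, n_bars):
    """Start time in seconds of each of the first n_bars bars of one voice."""
    seconds_per_beat = Fraction(60, tempo_bpm)
    return [bar * beats_per_bar * seconds_per_beat for bar in range(n_bars)]

# Two voices at the same tempo but in different meters (3 vs. 4 beats per bar).
voice_a = barline_times(3, 120, 9)
voice_b = barline_times(4, 120, 7)
# Moments where both voices begin a bar together.
shared_barlines = sorted(set(voice_a) & set(voice_b))

# Two voices in the same meter, one racing ahead at a faster tempo.
slow = barline_times(4, 120, 4)
fast = barline_times(4, 150, 4)
```

Exact fractions are used so that coinciding barlines compare equal without 
floating-point fuzz. A real implementation would also have to cope with meter 
and tempo changes partway through a voice.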

A third thing revolves around scales and tunings. Rosegarden has some plans 
for different tunings and perhaps support for quarter tones. And I think, 
again, that a Rosegarden tuning would be applied to an entire composition, 
not a single instrument. But I'm thinking of things more general. Suppose 
each track had an element called "tuning". Suppose also that each note is 
stored as an integer in a note event (as it is in Rosegarden). The tunings 
element contains a mapping between the note number given in the event and 
what is actually played. Microtones and temperaments could be stored in the 
tunings file as a MIDI note number plus a pitch bend. (Or we could store 
frequencies, and some function converts frequencies to MIDI note numbers plus 
pitch bends.) More novel things (such as octaves that aren't quite perfect 
frequency multiples) could also be accommodated. The tunings file could also 
be used to define scales that include added scale degrees. (e.g. for a 
Pythagorean tuning, we have to include extra values, so we can notate, say, 
an F## instead of a G, as these tones may not be equivalent in a non-equal 
temperament.)
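
The frequency-to-MIDI conversion mentioned above might look roughly like this 
in Python. This is only a sketch: the function name freq_to_midi_bend, the 
dictionary layout of the tuning, and the assumed pitch-bend range of +/-2 
semitones are all my own choices for illustration (the real range depends on 
how the synth is configured). The 14-bit bend value with 8192 as "no bend" is 
standard MIDI.

```python
import math

def freq_to_midi_bend(freq_hz, bend_range=2.0):
    """Convert a frequency to (MIDI note number, 14-bit pitch-bend value).

    bend_range is the synth's pitch-bend range in semitones (assumed +/-2).
    """
    exact = 69 + 12 * math.log2(freq_hz / 440.0)     # A4 = 440 Hz = note 69
    note = round(exact)                              # nearest tempered note
    offset = exact - note                            # leftover, in semitones
    bend = 8192 + round(offset / bend_range * 8192)  # 8192 means no bend
    return note, max(0, min(16383, bend))

# A hypothetical per-track tuning: event note number -> frequency in Hz.
# Here note 76 is a pure 3:2 fifth above A4 rather than the tempered E5.
tuning = {69: 440.0, 76: 660.0}
played = {n: freq_to_midi_bend(f) for n, f in tuning.items()}
```

Since pitch bend applies to a whole MIDI channel, two differently bent notes 
sounding at once would need separate channels. That is one more thing the 
design would have to account for.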

Rosegarden is trying to be fairly general about instrument definitions, which 
is good. I'm thinking of a somewhat different approach that looks to the future 
of MIDI, in which we want to achieve more realism of the performance. Each 
Track has this abstract thing called an Instrument. Instruments may be atomic 
(a string) or composite (a guitar, which is composed of several atomic string 
instruments). Let's suppose we use soundfonts. We could create a soundfont 
with banks 1-6 for the strings of a guitar. Each single-string Instrument 
gets mapped to one of the tone banks, and the guitar Instrument is composed 
of these single-string Instruments. 

In this way, it would be possible to achieve a playback in which an open D 
chord sounds different than a 5th position D chord played a string lower, 
just like on a real guitar. Instruments should have "modes" which can be 
selected (and changed within the song). A guitar might have pickstyle 
(which perhaps eventually gets mapped to tonebanks 1-6) and fingerstyle (which 
eventually gets mapped to tonebanks 7-12). An organ could have modes for 
various configurations of the stops. We could get very elaborate here and 
have organs with multiple keyboards, etc., but I'm falling into a dreamy 
state here. The idea is that the design should allow for this type of 
functionality to be added.
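
As a rough illustration of the atomic/composite idea, here is a small Python 
sketch. The class names StringInstrument and Guitar, the resolve method, and 
the per-mode bank offsets are hypothetical; they just follow the guitar 
example above (banks 1-6 for pickstyle, 7-12 for fingerstyle).

```python
from dataclasses import dataclass, field

@dataclass
class StringInstrument:
    """Atomic instrument: one string, mapped to one soundfont bank."""
    name: str
    open_note: int   # MIDI note number of the open string
    bank: int

    def resolve(self, fret):
        return self.bank, self.open_note + fret

@dataclass
class Guitar:
    """Composite instrument built from atomic string instruments."""
    strings: list
    mode: str = "pickstyle"
    # Bank offset per mode: pickstyle -> banks 1-6, fingerstyle -> banks 7-12.
    mode_offsets: dict = field(
        default_factory=lambda: {"pickstyle": 0, "fingerstyle": 6})

    def resolve(self, string_index, fret):
        bank, note = self.strings[string_index].resolve(fret)
        return bank + self.mode_offsets[self.mode], note

# Standard tuning, low to high, on banks 1-6.
guitar = Guitar([StringInstrument(n, note, bank) for bank, (n, note) in
                 enumerate([("E2", 40), ("A2", 45), ("D3", 50),
                            ("G3", 55), ("B3", 59), ("E4", 64)], start=1)])

# The same D (MIDI note 62) on two different strings lands in different banks.
on_b_string = guitar.resolve(4, 3)   # B string, 3rd fret
on_g_string = guitar.resolve(3, 7)   # G string, 7th fret
```

The point is that the note event alone does not decide the sound; the string 
(and the current mode) picks the bank, so the two fingerings of the same 
pitch can be rendered with different samples.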

So, how can we accomplish such a beast? Let's take a tour of what's in my 
brain. No, no! Not over there! Don't open that door. That's the basement 
where all the beastly details hide. Whatcha mean it's empty? Of course it's 
empty. No, not that door either. That's the closet of confusion and 
inconsistency. This way, this way. Here's the beautiful entry...


Project Overview
No pretty pictures, so you'll just have to imagine.

We begin with a main window. The document-view model is appropriate. We'll 
have the typical views that you'd expect -- TrackView, ScoreView (StaffView), 
MixerView, and some sort of EventView.

TrackView should look similar to that of Rosegarden, but we need to implement 
it differently. The window needs to be some sort of split window, in which 
each track gets its own pane. At the top should be a time ruler which can 
display in seconds, SMPTE, or beats/measures of a track. It should default to 
displaying beats/measures of track 1. Each track pane displays its clips 
similar to Rosegarden/Cakewalk. But it should hold a TrackGroup instead of a 
single Track (by default, each Track is in its own group). Additionally, it 
should have a hideable ruler that displays time (default to beats/measures) 
and tempo markings for that TrackGroup. This means that the TrackGroup data 
structure needs to hold info about its meter and tempo. TrackGroups can be 
expanded to show each track. They can also be hidden to reduce clutter. 
TrackGroups can be nested (e.g. strings, violins, violin1). During playback, 
a big vertical ruler bar should follow the sequencer, similar to the appearance 
in Rosegarden.
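
One possible shape for the TrackGroup data structure, sketched in Python. 
The inheritance rule (a group without its own meter or tempo takes its 
parent's) and the fallback defaults of 4/4 and 120 bpm are assumptions of 
mine, not something decided above.

```python
class TrackGroup:
    """A nestable group of tracks that may carry its own meter and tempo."""

    def __init__(self, name, meter=None, tempo_bpm=None):
        self.name = name
        self.meter = meter          # e.g. (3, 4), or None to inherit
        self.tempo_bpm = tempo_bpm  # or None to inherit
        self.parent = None
        self.children = []

    def add(self, child):
        child.parent = self
        self.children.append(child)
        return child

    def _inherited(self, attr, default):
        # Walk outward until some enclosing group defines the attribute.
        group = self
        while group is not None:
            value = getattr(group, attr)
            if value is not None:
                return value
            group = group.parent
        return default

    def effective_meter(self):
        return self._inherited("meter", (4, 4))

    def effective_tempo(self):
        return self._inherited("tempo_bpm", 120)

# The nesting from the example above: strings > violins > violin1.
strings = TrackGroup("strings", tempo_bpm=96)
violins = strings.add(TrackGroup("violins", meter=(3, 4)))
violin1 = violins.add(TrackGroup("violin1"))
```

With this layout the per-group ruler can ask its group for the effective 
meter and tempo, and the simple case (nothing set anywhere) still behaves 
like a single global meter and tempo.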

ScoreView should display like a score with staff notation. Tracks can be 
grouped, and it should be possible to hide groupings. Expanded track 
groupings should show each track. When compressed, a single staff with 
multiple voices (1 for each track) should be shown. In this way, 4-part 
harmony, drums, etc. can be displayed as is most convenient to the user.

Tracks should be displayed with appropriate staves (single 5-line staff for 
voice, optional tablature for stringed instruments, etc.) Non-MIDI tracks 
such as wav audio should be displayed as a simple line or bar. We should 
include typical musical symbols, dynamics, guitar tab symbols, lyrics, 
section markers, and effects markers (things like echo, flange, etc.) Each 
track should have an option to hide various things (e.g. hide the tablature, 
lyrics, or effects markers).

Eventually, we should be able to print out sheet music.

The MixerView should be pretty typical, but again, with the ability to create 
a nesting of sub-mixers. Each track should get its own effects rack. 
Additionally, each sub-mixer (or track group) should get an effects rack. 
This way, we can set effects for individual instruments. Then we can layer 
another level of effects (say panning or delay) to a group. Finally, we can 
add a layer for everybody (say reverb). As much as possible, we shouldn't 
care whether we're dealing with wavs or MIDI. If we have a MIDI track, the 
popup menu should only display MIDI effects. For aggregate groupings, only 
effects that can be used on either MIDI or wave should be available. (Perhaps 
we can take existing MIDI or audio only effects and put wrappers around them 
that select which effect is actually used.) Supporting plug-in effects would 
be very good. In the end, though, I think effects should be handled in the 
same way as instruments. That is, we build complex effects out of simpler 
components.
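
Here is a rough Python sketch of the layered effects racks and of the menu 
filtering described above. The names MixerNode, signal_chain, 
offered_effects, and the table of which effect handles which kind of 
material are all made up for illustration.

```python
class MixerNode:
    """A track or sub-mixer with its own effects rack."""

    def __init__(self, name, effects=(), parent=None):
        self.name = name
        self.effects = list(effects)
        self.parent = parent

    def signal_chain(self):
        """Effects in the order the signal meets them: own rack first,
        then each enclosing sub-mixer's rack, ending at the master."""
        chain, node = [], self
        while node is not None:
            chain.extend(node.effects)
            node = node.parent
        return chain

# Which kinds of material each (hypothetical) effect can process.
EFFECT_KINDS = {"transpose": {"midi"}, "flange": {"audio"},
                "pan": {"midi", "audio"}, "delay": {"midi", "audio"},
                "reverb": {"midi", "audio"}}

def offered_effects(kinds_in_group):
    """Effects usable on every kind of track present in the group."""
    return sorted(name for name, kinds in EFFECT_KINDS.items()
                  if kinds_in_group <= kinds)

master = MixerNode("master", ["reverb"])
strings = MixerNode("strings", ["pan", "delay"], parent=master)
violin1 = MixerNode("violin1", ["transpose"], parent=strings)
```

A MIDI-only track would be offered offered_effects({"midi"}), while a group 
mixing MIDI and audio would only see effects that handle both.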

This leads me to instruments. Instruments and effects should be stored in 
definition files. Then we can set up our instruments once and use them in 
multiple compositions. Ditto for tunings.

Files should be stored as XML (probably compressed, as Rosegarden does).
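
A minimal Python sketch of compressed XML storage, using only the standard 
library. The element and attribute names (composition, track, name, tuning) 
are placeholders I made up; no file format has been designed yet.

```python
import gzip
import xml.etree.ElementTree as ET

def dump_composition(root):
    """Serialize an element tree to gzip-compressed XML bytes."""
    return gzip.compress(ET.tostring(root, encoding="utf-8"))

def load_composition(blob):
    """Inverse of dump_composition: decompress and parse."""
    return ET.fromstring(gzip.decompress(blob))

# Build a tiny document with one track that refers to a tuning by name.
root = ET.Element("composition")
ET.SubElement(root, "track", name="violin1", tuning="pythagorean")

blob = dump_composition(root)
restored = load_composition(blob)
```

Writing blob to disk with an ordinary binary file write gives a file that 
standard gzip tools can unpack into readable XML, which is handy when 
debugging.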


Design Goals
1) Flexibility is paramount. A rigid design isn't going to work here. There 
are too many unknowns and open-ended issues.

2) Modularity. Keep the pieces as independent as possible. Try to make 
generic base classes that defer as much of the details as possible to derived 
classes.

3) Begin with a generic framework. Determine how the system should operate. 
Do we run the whole thing like an IDE for a programming language? How 
separated should the editor and sequencer be? Do we edit, save to file, and 
compile (sequence)? Or should the sequencer be more closely coupled, causing 
the data structures in memory to be shared between editor and sequencer? The 
first approach would make it difficult (or impossible) to do realtime mixing 
of MIDI tracks, but it would provide for very efficient playback. All of the 
MIDI tracks would be compiled to a MIDI file, so the on-the-fly processing 
would be minimal. Efficient, but limiting. Perhaps it should run more like an 
interpreter, processing the file on demand. A "mixdown" option could be 
added, that would compile everything down for a final release. Then anyone 
with the sequencer could play the file. (Analogous to Windows Media Player, 
in which you can play .avi files, but you can't create or edit them.)

4) Simple things should be simple. The default should be that every track 
plays at the same tempo and in the same meter, instruments should map to 
a General MIDI bank, and equal-temperament tuning with standard key signatures 
should be used. Recording, playback, and basic editing should be easy and 
intuitive. The most common features and feature groupings should appear in 
the menus. An advanced tab should be present to allow for lesser-used 
features.

5) Don't lose sight of the big picture. The end goal is to have a product 
that allows for audio and MIDI tracks that will play in sync. Other types of 
tracks (such as video, or images) could be included as well.

6) Get something that's useful and provides basic functionality up and 
running quickly. Add features step by step. A brilliantly conceived program 
isn't worth much if it doesn't run.

7) I'm leaning toward C++ as the primary language, though it may be better to 
use a scripting language such as Perl (which I know) or Python (which I don't 
know) for certain portions. My most common methodology involves writing a 
core engine in C++, then developing a simple scripting language (usually in 
Perl) to access the features of the core. I'm not sure such an approach is 
appropriate here, though.
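
The compile-versus-interpret question in goal 3 can be sketched in a few 
lines of Python. This is only a toy: tracks are assumed to be lists of 
(time, event) pairs already sorted by time, and the names play_order and 
mixdown are hypothetical.

```python
import heapq

def play_order(tracks):
    """Lazily yield (time, event) pairs from all tracks in time order.

    Nothing is merged ahead of time; events are pulled from the tracks
    only as playback asks for them (the "interpreter" style).
    """
    return heapq.merge(*tracks, key=lambda ev: ev[0])

def mixdown(tracks):
    """Merge everything up front into one flat list (the "compiled" style)."""
    return list(play_order(tracks))

tracks = [
    [(0, "C4 on"), (2, "C4 off")],
    [(1, "E4 on"), (2, "E4 off")],
]
flat = mixdown(tracks)
```

The compiled list is cheap to play back but has thrown away the track 
structure, which is the trade-off described in goal 3. The lazy version 
keeps the tracks separate at the cost of doing the merge during playback.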



End Notes for the Curious
I used to be a software designer and programmer. I've been stagnant for 
several years now due to an illness that's taken away my presentability in job 
interviews. (It's hard to make a good impression in today's world of 
corporate fluff where appearance is more important than ability when you look 
haggard, shake like an addict in detox, and have periods of brain fog that 
never fail to appear in an interview.)

I don't know all that much about the down and dirty of audio programming. I 
wrote a little media player tool with DirectX in Windows, a Linux tool that 
converts files between MIDI and XML (I should probably stick this out there 
on SourceForge, as someone else might think it useful. How do I do that?), 
and a few other little audio utilities.

I know a good deal about software design and maintenance, and I'm proficient 
in C++ (and a few other languages as well). I've been project leader on a few 
different pieces of software of substantial (100k-200k lines) size, with 
teams of 10 people or fewer.

My degree is in applied mathematics.

For any university professors out there, I think this would be a great 
master's (or phd) thesis. And I just might know a pretty good candidate.  ;)



