Ideas for new sample libraries

Post by **sbenno** » Tue Feb 05, 2008 10:46 pm

Yes, a General MIDI compatible drumkit makes sense.
Many users would like to play General MIDI files in high quality.
The default MIDI synthesizers on all operating systems are quite low quality:
- Microsoft Wavetable Synth (Windows)
- Quicktime MIDI synth (Windows, OS X)
- timidity (Linux)

All three sound like a tin can, with no comparison to the quality of the 13-year-old Roland Sound Canvas
(which only had 8MB of sample ROM and 64 voices; some instruments used more than one voice, so actual polyphony
was lower, yet GM MIDI files sounded great).

I know that Roland's engineers did exceptional work squeezing so many sounds into a mere 8MB (it involved lots of looping and FX
juggling), but nowadays, given a good and large disk-streamed GM sample set, it should be possible to achieve much better
General MIDI playback than 13 years ago.

LS + a decent GM sample set could make this a reality. The question is only if the community will be able to produce a good GM set within
a reasonable timeframe.

Post by **dahnielson** » Tue Feb 05, 2008 11:12 pm

Talking about small memory footprints: this was done with a 32 MB sound bank at its disposal, but I don't think all the samples were actually used...

(Wish I could say I had programmed it, but sadly no, it's a demo.)

Post by **Consul** » Fri Feb 08, 2008 10:59 pm

Thoughts on how I might make a didgeridoo library (let's see if I can explain this well):

Each drone will start with the "attack" (inasmuch as you can call it that), leading into a nice long drone. Then after the drone, various articulations will be played at various speeds. The final wave file for that didge will likely be very large. Then, loop points would be selected around the steady drone and around each articulation.

To play, press the key for the note wanted once (key latching), and use a number of other keys on the keyboard to access articulations. The idea is, as you press a control key (this is the key-switching part) the loop would immediately switch, via a fast crossfade, to the chosen articulation, before returning to the drone, also via crossfade. So, one could basically play all of the available articulations via the keyboard.

The important features I would need for this are multiple loops, crossfading loops, key-latching, and keyswitch-controlled loop points with scripting control for the loop return. I'm sure I'll think of more.
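The key-latching and keyswitch behaviour described above can be sketched as a small state machine. This is only an illustration in Python, not LinuxSampler code: `LoopRegion`, `DidgeVoice`, the frame numbers and the keyswitch key 36 are all made-up names and values, and the crossfade itself is only recorded, not rendered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoopRegion:
    name: str
    start: int  # first frame of the loop inside the long recording
    end: int    # one past the last frame of the loop


class DidgeVoice:
    """Tracks which loop region of one long didgeridoo recording is active."""

    def __init__(self, drone, articulations, fade_frames=512):
        self.drone = drone
        self.articulations = articulations  # keyswitch key -> LoopRegion
        self.fade_frames = fade_frames      # length of the fast crossfade
        self.current = None                 # None means the voice is silent
        self.switches = []                  # (from_name, to_name) crossfades requested

    def _crossfade_to(self, region):
        source = self.current.name if self.current else None
        self.switches.append((source, region.name))
        self.current = region

    def press_note(self):
        """Key latching: first press starts the drone, second press stops it."""
        if self.current is None:
            self._crossfade_to(self.drone)
        else:
            self.current = None

    def press_keyswitch(self, key):
        """Jump to an articulation loop, but only while the drone is latched."""
        if self.current is not None and key in self.articulations:
            self._crossfade_to(self.articulations[key])

    def loop_finished(self):
        """Called when the active loop ends: articulations return to the drone."""
        if self.current is not None and self.current is not self.drone:
            self._crossfade_to(self.drone)


# Hypothetical regions inside one large wave file.
drone = LoopRegion("drone", 48_000, 480_000)
bark = LoopRegion("bark", 600_000, 640_000)
voice = DidgeVoice(drone, {36: bark})
voice.press_note()         # latch the drone
voice.press_keyswitch(36)  # switch to the "bark" articulation
voice.loop_finished()      # fall back to the drone
print(voice.switches)
```

The "scripting control for the loop return" corresponds to `loop_finished` here; a real engine would presumably let the instrument script decide what to return to.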

Post by **frink** » Wed Mar 05, 2008 12:53 am

After thinking about this for the past two months I'm finally ready to share my thoughts:

To accurately represent a soloist and her instrument is one of the greatest challenges of sampling. Each sample voice must allow a full range of articulations. It's a bit daunting to think of 1 million little sound files. But that is about how many it takes to approximate a live performer. My goal here is to show how far we have to go in order to produce perfect sample representation. This specification is not exhaustive but it is a draft of what I see a sample file being in a perfect world.

Few instruments have more than 12 distinct articulated tones. Some instruments produce different sounds depending on the amount of time they are played. Some have different ways to transition between notes. Bowed instruments have to record slides at several speeds to maintain an accurate tremolo. Each note must be recorded at all seven dynamic levels (pp, p, mp, m, mf, f, ff). Round robin variations should be recorded at least for the attack so that no sample is ever repeated in rapid play. Therefore I propose the following specification as an example of how many files and how much logic is needed to pull off an accurate performance.

1 Voice:
12 Articulations
88 Notes
5 Note Variations
5 Note Transitions
88 Transition Notes
5 Transition Note Variations

Most instruments will never use the full range of this specification.
(1 X 12 X 88 X 5): 5,280 Articulated notes
(1 X 12 X 88 X 5 X 88 X 5): 2,323,200 Transition notes
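For anyone who wants to check the arithmetic, here is a minimal Python sketch. The dictionary keys are just labels for the factors listed above. Note that the formula as posted multiplies in only two of the three factors of 5 from the list; the sketch reproduces the formula as written rather than guessing which factor was meant to be left out.

```python
from math import prod

# Factors for a single articulated note, as listed in the specification.
articulated_factors = {
    "voices": 1,
    "articulations": 12,
    "notes": 88,
    "note_variations": 5,
}

# Extra factors that the posted formula applies for transition notes.
transition_factors = {
    "transition_notes": 88,
    "transition_note_variations": 5,
}

articulated_notes = prod(articulated_factors.values())
transition_notes = articulated_notes * prod(transition_factors.values())

print(f"{articulated_notes:,} articulated notes")  # 5,280 articulated notes
print(f"{transition_notes:,} transition notes")    # 2,323,200 transition notes
```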

If we were trying to sample woodwinds we would also need to take into account the breathless state of the player. Thus, if we created samples for each articulation and transition in four breathless states it would produce a sample file of around 1 million samples. However, this sample would fully display the capabilities of not only the individual instrument but the individual player as well.

In bowed instruments we need an easy way to script bow logic. We can record both up and down stroke samples but in order to be completely accurate we must specify which stroke should be used. We must also have loop synchronization so that when a viola player changes direction while bowing two strings both notes are affected in unison.
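A rough sketch of what such bow logic might look like, in Python for illustration only. `BowState` and its methods are hypothetical names, and the rule used here (alternate strokes unless the score says otherwise) is an assumption, not something the proposal fixes. Notes bowed together always share one direction, which is the "in unison" requirement for double stops.

```python
class BowState:
    """Chooses up or down stroke samples for a bowed instrument."""

    def __init__(self, first="down"):
        self.next_direction = first

    def stroke(self, notes, direction=None):
        """Return (note, direction) pairs for notes bowed in one stroke.

        An explicit direction overrides the alternation; either way the
        following stroke defaults to the opposite direction.
        """
        chosen = direction or self.next_direction
        self.next_direction = "up" if chosen == "down" else "down"
        return [(note, chosen) for note in notes]


bow = BowState()
print(bow.stroke([60, 67]))               # double stop, both notes on one stroke
print(bow.stroke([62]))                   # direction alternates automatically
print(bow.stroke([64], direction="up"))   # stroke forced by the score or script
```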

We of course need key groups and latch modes for drums and loops. We need multiple loop regions on a single sound. This is probably better served through scripted logic than hard and fast region markers.

Sample articulations need to be cross-faded rather than switched. This allows for more truthful calling of the effect in question. We could simply allow MIDI control using a number of sliders but it would be better to embed cross-fade logic scripts in the sample.
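As an illustration of cross-fading rather than switching, here is a minimal equal-power crossfade in Python. The function names are made up, the two articulations are assumed to be time-aligned lists of samples, and the equal-power curve is one common choice, not something the proposal prescribes. The position could come from a MIDI slider (controller value divided by 127) or from an embedded script.

```python
import math


def crossfade_gains(position):
    """Equal-power gains for a fade position in [0, 1].

    Returns (gain_a, gain_b) with gain_a**2 + gain_b**2 == 1, so the
    perceived loudness stays roughly constant through the fade.
    """
    p = min(1.0, max(0.0, position))
    return math.cos(p * math.pi / 2), math.sin(p * math.pi / 2)


def crossfade(samples_a, samples_b, position):
    """Mix two time-aligned articulations at the given fade position."""
    gain_a, gain_b = crossfade_gains(position)
    return [gain_a * a + gain_b * b for a, b in zip(samples_a, samples_b)]


# Hypothetical use: MIDI controller value 64 sits near the middle of the fade.
mixed = crossfade([1.0, 1.0], [0.0, 0.5], 64 / 127)
```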

I think that building this using the full DLS specification, including the things that GIG left out, would be a place to start. However, the biggest thing that we need is the power of scripting. I think embedding a scripting language for this purpose would be ideal. Lua might be good, or perhaps Nyquist, PD, Csound or some other music scripting language.

If we use processing instead of pure sampling, transitions might be scriptable. It would be wonderful to have the ability to do realtime sound stretching so that slide times could be adjusted on the fly. We might only allow shrinking since it is a subtractive action.

I'm proposing very heavy computing that may not work in realtime on the samples. The amount of disk I/O involved, or of RAM needed to make this data available quickly, is quite high. My point is to allow accurate capture of sound in the sample files. There may be ways to speed up the realtime performance for low power systems by scripting the features instead of using the recording. However, the recording will always sound more human than a computer. Therefore I propose that we allow render operations in LinuxSampler to allow lookahead capability. These functions should be passed to a logic script that will allow the choice of things such as breath marks and phrasing. It will also allow more process intensive scripts to run without needing realtime speed.

Naturally, we would want to create an auto-generation script rather than editing each of these samples by hand. With the rich command line environment in Linux we can dream of samples this big. It would make sense if we could use the same script language that we embed in the samples themselves. The ability to process input streams and "compile" samples would make the act of creating samples much easier.

Post by **dahnielson** » Sun Mar 09, 2008 3:18 pm

Sorry, I have been away but am now back home. Don't have time to answer to the extent I wished; these things really excite me and have been on my mind for the last seven years (recently my interest has also turned to additive resynthesis).

Just to point you in the direction of previous work, and for inspiration, take a look at the technique behind the now discontinued Garritan Stradivari and Gofriller, developed by Giorgio Tommasini and implemented in Kontakt. He is doing some great work with sample based modeling using a technique with harmonically aligned samples. Part of the technique is apparently patent-encumbered (well, at least it has a patent pending status). The Garritan instruments were afaik based on recordings not made specifically for the modeling process (for instance there's a Strad in GPO, guess it's from the same recording session). Tommasini is now working together with Peter Siedlaczek.

One observation is that, in addition to cross-fading dynamic layers, you can have separate samples for the attack and for the sustained note at different dynamics, making it possible to put them together on the fly.

A project to capture an instrument more truthfully is a good example of what the next-generation sampler engine in LS should be about (search the forum for more details and rantings about it).

Just one thing I'm unsure if you have overlooked: transitions, I'm guessing you refer to legato? One problem has traditionally been that each legato sample is generally tied to at least one target note and possibly a starting note making "performance sets" quite large in addition to being tied to a tempo (usually two or three tempos are recorded).
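To make the size problem concrete: a legato "performance set" is effectively a table keyed by starting note, target note and recorded tempo. A minimal Python sketch of the lookup, assuming a plain dictionary and choosing the recorded tempo closest to the requested one (the sample names and the `pick_transition` helper are invented for the example):

```python
def pick_transition(library, from_note, to_note, tempo_bpm):
    """Pick the legato sample recorded closest to the requested tempo.

    library maps (from_note, to_note, recorded_bpm) -> sample name.
    Returns None when no transition was recorded for this note pair.
    """
    tempos = [bpm for (src, dst, bpm) in library
              if src == from_note and dst == to_note]
    if not tempos:
        return None
    nearest = min(tempos, key=lambda bpm: abs(bpm - tempo_bpm))
    return library[(from_note, to_note, nearest)]


# Hypothetical set: one interval recorded at two tempos.
library = {
    (60, 62, 60): "c4_d4_slow.wav",
    (60, 62, 120): "c4_d4_fast.wav",
}
print(pick_transition(library, 60, 62, 100))
```

Every additional starting note, target note or tempo multiplies the number of keys, which is why these sets grow so quickly.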

Something not to be afraid of is applying filtering and modulation to samples, as long as you know what you are doing. For instance, by knowing whether a player of an instrument would change the pitch or the amplitude to create a vibrato, I once did a usable manipulation of the samples I had, making the vibrato changeable on the fly. (Of course having real vibrato samples is always better, but my point was the possibility of merging by modulation.)
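A minimal Python sketch of the two kinds of vibrato mentioned above, assuming a simple sine LFO. This is not the manipulation actually used back then, just an illustration: all function names are made up, the amplitude case scales the samples directly, and the pitch case only computes per-frame playback-rate ratios that a resampler would then apply.

```python
import math


def vibrato_lfo(n_frames, sample_rate, rate_hz):
    """Sine LFO in [-1, 1], one value per audio frame."""
    return [math.sin(2 * math.pi * rate_hz * i / sample_rate)
            for i in range(n_frames)]


def amplitude_vibrato(samples, lfo, depth):
    """Amplitude modulation: scale each sample by 1 + depth * lfo."""
    return [s * (1.0 + depth * l) for s, l in zip(samples, lfo)]


def pitch_vibrato_ratios(lfo, depth_cents):
    """Pitch modulation: per-frame playback-rate ratios for a resampler."""
    return [2.0 ** (depth_cents * l / 1200.0) for l in lfo]


# Hypothetical settings: 5 Hz vibrato at 44.1 kHz, changeable on the fly.
lfo = vibrato_lfo(n_frames=4410, sample_rate=44100, rate_hz=5.0)
ratios = pitch_vibrato_ratios(lfo, depth_cents=30)
```

Which of the two to use depends on the instrument, which is the point made above about knowing how the player actually produces the vibrato.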

A lot also comes down to how to control the virtual instrument. I'm a big fan of using the keyboard, modulation wheel and expression pedal together (and a breath controller instead of the modulation wheel for brass/wind instruments) and letting the performance scripting sort out what the combined input from all controls should sound like.

Post by **frink** » Mon Mar 24, 2008 11:38 pm

Sorry I've been away for a while as well. (Development keeping me busy and all.) I understand about applying modulations and synthesis and I certainly do not want to diminish these demands in the spec for a new sampling engine. However, I'm coming at things from a classic sampler point of view. I'm trying to avoid the excessive post processing that is normal in modern samplers. This could be compared to a graphics project I finished yesterday doing up-sampling on a photo from 2MP to just over 12MP. It required an intelligent being to sit down and repaint most of the photo to achieve a look of a real photo at 600DPI and poster size. It can be done but it's difficult and time consuming to get it right. It would have been much easier if someone had given me a larger photograph. Why not take the photo at 12 mega-pixels to begin with?

The same is true for audio transition samples. Transitioning using a tied note, slur or slide can produce drastically different sounds that could be modelled with extensive study. However, it takes a lot of work that in my opinion should not be necessary. My dream is to allow the average musician to walk into a studio and spend 20-or-so hours to produce a life-like sampled representation of HER playing. The biggest point where sampled instruments fall down is in the transition between notes. Certainly, if we want to allow the musician who is performing with the final samples to use advanced breath control we must provide that through modelling. However, most of the time the composer who uses our sampled instrument does so because he does NOT know how to play that instrument with any sort of grace.

Up to this point we have been focused on capturing the sound of an instrument with the act of sampling. What I'm suggesting now is a paradigm shift to the capture of an actual performance of both instrument and musician. This happens in much smaller time slices than traditional recording and requires much more intelligent logic than traditional sampling. Both are now at our fingertips with modern computing. However, up until now, we have been focused at getting only the notes and then morphing between them with various processing. I'm looking for the day when we can refine the science of sampling to such an extent that we can create a critique-accurate representation of an instrumentalist and his instrument. Really it is much the same goal as speech synthesis; although I think we are much closer in music than in speech at this point. The transition between notes is the biggest thing that is unique to an instrument/musician combination. There is no reason that we could not record such an extensive library of samples. It is then left in the hands of the creators of sampling formats and sample engines to make the playback of such samples feasibly possible.

Sadly, the only way to get a performance-accurate sample of each note transition is to record each at various speeds and dynamics. Sampled music is very much like mosaic artwork. We seek to find the sample which most accurately depicts the real performance and splice it into our creation. Many of us involved with computer music look for the day when a computer can approximate a human performance. We are doing basic synthetic performances now with orchestral samples but individual instruments, for the most part, leave a lot to be desired. This is also partly because of the limitations of MIDI in its representation of performance. Sheet music still holds a much richer set of instructions than basic MIDI. I think that this is mostly due to the paradigm of the original people who created music electronically. They were not interested at the time in producing accurate representations of existing instruments but rather in creating new, previously non-existent sounds.

I predict that in the coming years sample technology will fork into two separate directions. There will still be a vein that focuses on real-time performance samples. However, more and more music technicians are dreaming about the power and grace of rendered sample splices with artificial intelligence applied to the performance. If, for example, I can send a MIDI file or a piece of sheet music and let the computer study it and apply some intelligent logic referring to its vast sample catalog, I am likely to have a much more realistic presentation than if the computer has to produce the sounds on-the-fly, not knowing what notes are coming before or after and having no cognisant understanding of the other instruments and their parts. This level of AI is only beginning to be possible on the ordinary home computer. But this performance-realistic paradigm has not been the focus of much digital art with the exception of graphics animation. (The Massive character engine used in Lord of the Rings is an example of this sort of AI.)

In the beginning of synthesised music we focused on producing sounds from oscillators using processing to shape them. Now I'm suggesting that we look at taking sound recordings from their original sources and attempt to micro-splice them together to produce performer-accurate representations of what is possible through that instrument. Essentially beat splicing at a micro level with crossfades to hide the splicework. The requirements I stated above would allow us to do at a sampling level what we have been doing for years at a processing level. This will allow a more human origin for performance and the warmth of the anomalies of acoustic music, i.e. warm-blooded sound.

I have no doubt that eventually we will get to the point of complete modelling synthesis. But even then I suspect that the human ear will prefer an imperfect human performance to the perfect models of computer technology. Therefore, I have focused on outlining a spec that would allow the sampling world to begin to move in this direction and begin the great paradigm shift away from instrument capture to performance capture. In this way, we will be able to produce more organic sounding recordings without giving up the advantages of our new digital tools.

I hope this unzips my brain a little and explains why my requirements are so massive. It's all about humanity even in our imperfect state. It will never cease to amaze me when I see a child discover that he can make beautiful sound from his own body with one implement or another. I wish that every child can one day have the feeling of directing her own orchestra and learn the beauty of making music together.

- FRiNK

Post by **frink** » Tue Mar 25, 2008 12:05 am

I just listened to the violin you refer to. This is very similar to stuff I've heard from Big Fish and London Solo Strings. I think the Stradivari tops it, but it is very good. Still, what I'm looking for is a way for solo musicians to participate in open source music by providing recordings of themselves playing their instruments that can be turned into sampled instruments. A true open orchestra that perhaps has chairs and everything. We need to provide a format that automates sample production and allows us to merely fine-tune the computer generated collection of samples. Much of what I wrote above was after writing a full specification for an open orchestra. I've thought of getting involved with the open orchestra project but it seems to have stagnated and it may be better to start my own rather than seek to resurrect that one...

Post by **dahnielson** » Tue Mar 25, 2008 10:48 pm

frink wrote:Certainly, if we want to allow the musician who is performing with the final samples to use advanced breath control we must provide that through modelling.

The use of a breath controller doesn't require "modeling" as in "synthesis". It is quite interchangeable with the modulation wheel if you lack one (and its controller values can be edited/generated in the sequencer) but it makes more sense for a lot of instruments and frees up the modwheel for other purposes. The "modeling" controlled by any controller, be it keyboard, pitch wheel, mod wheel, sustain pedal, expression pedal or breath controller, should be "behavioral modeling" (usually achieved through scripting and/or layers/dimensions), which is applicable to all sample based instruments.

frink wrote:However, most of the time the composer who uses our sampled instrument does so because he does NOT know how to play that instrument with any sort of grace.

I hope you don't imply that your prospective audience is a bunch of raging monkeys randomly hammering their keyboards sans grace.

On a more serious note, the use of a breath controller to play a sampled horn is nothing like learning to play the actual horn (I know, I started to play the horn at 7). The only learning curve will be learning how to use the controller itself and how it affects the sampled instrument in the same way you have to learn how to work the modwheel for a four-layer dynamic instrument in VSL (undoubtedly seriously sample based).

frink wrote:Up to this point we have been focused on capturing the sound of an instrument with the act of sampling. What I'm suggesting now is a paradigm shift to the capture of an actual performance of both instrument and musician.

Alex will probably now accuse me of nitpicking, but the act of sampling has always captured both the instrument and performer in addition to the room, microphone and pre-amplifier characteristics.

Don't get me wrong, I'm just yanking your chain. I see your point.

frink wrote:This is also partly because of the limitations of MIDI in its representation of performance. Sheet music still holds a much richer set of instructions than basic MIDI.

Agree. I don't know if HD-MIDI (or whatever it will be called) will rectify it. Anyone here sitting on a link with a good description of it without having to saw off and give MMA the upper part of your arm?

And apropos sheet music, everyone should take an inspirational look at the way LilyPond handles music.

frink wrote:I think that this is mostly due to the paradigm of the original people who created music electronically. They were not interested at the time with producing accurate representations of existing instruments but rather creating new previously non-existent sounds.

Yes. Remember that MIDI was created to control a bank of synths replacing CVs and the infamous "wall of synths". Sampling was just in its infancy when MIDI was created in the early 80's. Sure, Fairlight CMI had been released five years before publication of the MIDI 1.0 specification, but as a sampler it was fairly shitty despite its historical importance. Hardware samplers and sample based sound modules wouldn't become any good and affordable until the early 90's.

frink wrote:I predict that in the coming years sample technology will fork into two separate directions. There will still be a vein that focuses on real-time performance samples. However, more and more music technicians are dreaming about the power and grace of rendered sample splices with artificial intelligence applied to the performance. If, for example, I can send a MIDI file or a piece of sheet music and let the computer study it and apply some intelligent logic referring to its vast sample catalog, I am likely to have a much more realistic presentation than if the computer has to produce the sounds on-the-fly, not knowing what notes are coming before or after and having no cognisant understanding of the other instruments and their parts.

I don't see it as a necessary split, only a matter of sensible decoupling. If you have a very controllable real-time performance instrument then you can have an AI do the performance using the same interface/protocol as any human would. Of course better performances have been right, left and center in academic research with tools like Csound, its predecessors and similar toolsets being developed over the years.

BTW, a MIDI file (or to be specific a SMF) should be considered being a recorded performance and not notation. A program like Rosegarden has a very clever custom event system (internally supplants the MIDI events) to deal with both un-tight human performance values and the rigidity of sheet music values at the same time. On a second note, some of the most talented (and even less talented) sample based music "realizators" (wish that word existed in English) do "program" their pieces by performing each part and only doing minor editing of tempo maps, keyswitches and stray notes.

frink wrote:In the beginning of synthesised music we focused on producing sounds from oscillators using processing to shape them. Now I'm suggesting that we look at taking sound recordings from their original sources and attempt to micro-splice them together to produce performer-accurate representations of what is possible through that instrument. Essentially beat splicing at a micro level with crossfades to hide the splicework.

Now we're talking!

Just a quick and very broad historic rundown for everyone's enjoyment and not to be a nitpicking besserwisser:

It began back in the late 1800s and early 1900s with additive synthesis, but it turned out to be much of a burden for the creator/programmer of the virtual instruments (and a technical nightmare to be realistic). Attention then turned to subtractive synthesis, which was easy to implement and program but didn't produce any great realism (which was not necessarily the goal either). FM synthesis was introduced but it was difficult to program for most people. Then sampling made inroads and companies like E-MU knew how to combine the best of both worlds in terms of raw sampling and modulation from synthesis. But all the great hardware samplers (there were some great AKAI, Yamaha and E-MU's in the late 90's) were instantly killed by the release of Nemesys GigaSampler, which was actually really basic (extremely basic compared to many of the hardware samplers available) and whose only differentiating feature was disk streaming capable of playing back really long samples (no more looping!). Native Instruments' release of Kontakt 2 rectified the situation by combining a flexible sampler with features such as scripting (for behavior) in addition to disk streaming. With instruments like Wallander's WIVI (although it's mainly a commercialization and improvement of academic techniques developed in the 80's) we have somewhat come full circle back to where it started, using additive synthesis, only now backed by the power of computers capable of reconstructing the instrument from actual samples.

frink wrote:I have no doubt that eventually we will get to the point of complete modelling synthesis.

Yes. But it has a long way to go. The question is: Will it be necessary? Especially with methods such as Tommasini's "sample modeling" and Wallander's analysis/resynthesis based on anechoic samples of instruments. The first one is in essence analogous to your "micro-splicing".

Just to clarify my previous language here and elsewhere:

* When I talk about synthesizing it does include sample based methods as a subset. It's just a form of synthesis/resynthesis using sampled waveforms as oscillators instead of generated ones.

* When I talk about modulation I'm speaking about a controller manipulating the parameter of some other controller (like my hand modulating the volume knob on the stereo for some amplitude modulation) which can be used for behavioral modeling. See my lauding of E-MU in one of the "Next Generation" threads.

frink wrote:I just listened to the violin you refer to. This is very similar to stuff I've heard from Big Fish and London Solo Strings. I think the Stradivari tops it, but it is very good.

FYI, the Garritan Stradivari violin and Gofriller cello have been discontinued and the engineers behind it have moved on to take the technique further:

http://www.samplemodeling.com

frink wrote:Still, what I'm looking for is a way for solo musicians to participate in open source music by providing recordings of themselves playing their instruments that can be turned into sampled instruments. A true open orchestra that perhaps has chairs and everything. We need to provide a format that automates sample production and allows us to merely fine-tune the computer generated collection of samples. Much of what I wrote above was after writing a full specification for an open orchestra.

Yes, it has crossed my mind too. This is what currently fuels much of my research into sample recording (on a budget) utilizing an approximated free-field condition over a reflecting plane: Making it possible for people around the world to produce samples in equivalent settings by a defined and standardized recording method.

Automation can be useful in the process. But I think that manual editing will always be necessary to some degree (just like 3D replicators are cool but not always applicable). What the free and open source method can offer in the meantime is a distribution of workload so that the producer of samples does not necessarily need to edit anything him- or herself, similar to how translation and localization of FLOSS software is done today. Harness the power of the crowd!

frink wrote:I've thought of getting involved with the open orchestra project but it seems to have stagnated and may be better to start my own rather than seek to resurrect that one...

That's the right attitude!

It's always better to scratch your own itch as a start instead of waiting on someone else to do it for you. A project needs some traction to build momentum.

Post by **dahnielson** » Wed Mar 26, 2008 1:37 am

Maybe also worth posting this thread regarding (among other things) my own plans for sample recording automation:

http://www.northernsounds.com/forum/sho ... age_551972

Post by **Consul** » Wed Mar 26, 2008 3:02 pm

And here I am, the exact opposite. I want a sampler that allows me new and different ways to tweak, mangle, and otherwise process samples to make new sounds. My big inspiration has been the videos on the Omnisphere from Spectrasonics. Other than that, though, I do want some more traditional functionality, and maybe some new ways to make realistic performances of acoustic sounds (drums are one example).

Maybe the solution to this is going to be two separate samplers.
