VF-SSDPCM1 Super Plus by Algotech

platform :	Commodore 64
type :	demo
release date :	april 2018

popularity : 56%

0.85

alltime top: #14488

algorithm [code, sampling]

added on the 2018-04-06 21:02:35 by algorithm

comments

Technical details as follows

Another one of them streaming sample demos
------------------------------------------
I would rather call it a "proof of concept" and a demonstration of the output produced by the encoder. In a few words, this demonstrates the successor to ssdpcm1-super, with increased quality at nearly half the size. The aim is size reduction and increased quality in comparison to the ssdpcm1-super method. It is not meant to be compared with ssdpcm2 v3, which has far higher audible quality but produces larger packed files.

At its current stage, there are still some other enhancements that can be made to it, and these will more than likely be used as part of a demo (although one not taking up as much space as this demo does).

History of SSDPCM1 and variants
-------------------------------
SSDPCM2 will not be included in this section, only SSDPCM1.

SSDPCM1 is a 1-bit sample shaping method which rebuilds the sample by incrementing or decrementing the current sample by step sizes that are adjusted to follow the changing characteristics of the sample.

The very first demonstration of the SSDPCM1 method was in my demo "channels". This method was the most basic and would merely select one step size for a given chunk and attach this to a stream of 1-bit packed data. This single step size would serve as both a positive and a negative value. The step size would change per chunk to accommodate changes in the waveform more precisely (rather than using just one constant step value for the entire sample stream).

The bitstream would then be read: a set bit indicates to add the step size to the current sample, and a clear bit to subtract it from the current sample.
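
As a rough illustration of the decode step just described, here is a minimal Python sketch (it is not the actual 6502 decoder). The function name, the MSB-first bit order and the clamping to 0..255 are assumptions made for the example.

```python
def decode_ssdpcm1_chunk(step, packed, start):
    """Decode one chunk: a set bit adds `step`, a clear bit subtracts it."""
    sample = start
    out = []
    for byte in packed:
        for bit in range(7, -1, -1):            # MSB first (an assumption)
            if (byte >> bit) & 1:
                sample += step
            else:
                sample -= step
            sample = max(0, min(255, sample))   # clamp to unsigned 8-bit (an assumption)
            out.append(sample)
    return out

# One packed byte yields eight output samples.
samples = decode_ssdpcm1_chunk(step=4, packed=bytes([0b11110000]), start=128)
print(samples)
```

Four set bits followed by four clear bits ramp the sample up by four steps and then back down again.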

It worked pretty well, but would result in quite a lot of artifacts when dealing with more complex waveforms.

The second demonstration of ssdpcm1 was used as the end part of my algodreams demo. The decoder worked in exactly the same way, however the encoder would have 8 bytes of lookahead, allowing locally worse choices where these would result in fewer errors later. Quality was noticeably improved and no change was required in the decoder.

For both the above methods, step values were updated per 64 or 128 samples. It really was not feasible to have one step size per frame or so (e.g. for 8400hz, one step size per 168 samples). Quality would suffer.

Then there was a drastic change to the SSDPCM1 method, which became known as SSDPCM1-Super. (This was demonstrated in my "axel f" demo.)

This method would increase the file size (but not by that much) and quality would be a lot higher.

The encoder would brute force two step values per chunk and would choose, per 8 bytes, which one of the two to take "through the path".

The advantage of this approach was that it would require only 1 bit of additional data per 8 bytes, which would give the decoder the correct step value to use to decode those 8 bytes.
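
To make the two-step idea more concrete, here is a small Python sketch of a brute-force encoder and a matching decoder. It is not the actual encoder: the greedy per-sample bit choice, the squared-error measure, the `max_step` search limit and all names are assumptions. Only the overall shape (two step values per chunk, one selector bit per 8 bytes of 1-bit data) follows the description above.

```python
from itertools import combinations_with_replacement

GROUP = 64  # samples covered by one selector bit (8 bytes of 1-bit data)

def encode_group(target, step, start):
    """Greedily encode `target` with one step size.
    Returns (bits, squared_error, last_sample). No clamping, for brevity."""
    cur, bits, err = start, [], 0
    for t in target:
        up, down = cur + step, cur - step
        bit = 1 if abs(t - up) <= abs(t - down) else 0
        cur = up if bit else down
        bits.append(bit)
        err += (t - cur) ** 2
    return bits, err, cur

def encode_chunk(target, start, max_step=16):
    """Try every unordered pair of step values; keep the lowest total error."""
    best = None
    for steps in combinations_with_replacement(range(1, max_step + 1), 2):
        cur, total, selectors, bits = start, 0, [], []
        for i in range(0, len(target), GROUP):
            group = target[i:i + GROUP]
            # Encode this group with both steps and keep the better one.
            tries = [encode_group(group, s, cur) for s in steps]
            sel = 0 if tries[0][1] <= tries[1][1] else 1
            g_bits, g_err, cur = tries[sel]
            selectors.append(sel)
            bits.extend(g_bits)
            total += g_err
        if best is None or total < best[0]:
            best = (total, steps, selectors, bits)
    return best

def decode_chunk(steps, selectors, bits, start):
    """The selector bit picks the step size for each group of 64 samples."""
    cur, out = start, []
    for i, bit in enumerate(bits):
        step = steps[selectors[i // GROUP]]
        cur += step if bit else -step
        out.append(cur)
    return out

# A quiet passage followed by a louder one (made-up test data).
target = ([128 + (i % 4) for i in range(64)]
          + [128 + 40 * ((i // 4) % 2) for i in range(64)])
error, steps, selectors, bits = encode_chunk(target, start=128)
decoded = decode_chunk(steps, selectors, bits, start=128)
```

The decoder stays as cheap as plain SSDPCM1; all of the extra work sits in the encoder's search.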

Overall, the added file size would then be (((samplerate/50)/8)/8)+2 bytes per frame, added to the bitstream of (samplerate/50)/8 bytes per frame.

Comparing it to ssdpcm1 at 8400hz, for example, the file sizes per second would be as follows:

SSDPCM1 1100 bytes
SSDPCM1-Super 1350 bytes
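
The per-frame arithmetic above can be checked with a few lines of Python. This is only a sketch: the function names are made up, the single step byte for plain SSDPCM1 is inferred from the 1100 bytes figure, and rounding the selector bits up to whole bytes is an assumption. With that rounding, 8400hz comes out at 1300 rather than the 1350 quoted, so the real packing may pad differently.

```python
import math

FPS = 50  # frames per second, from the formula above

def ssdpcm1_bytes_per_frame(sample_rate):
    bitstream = sample_rate // FPS // 8      # 1 bit per sample
    return bitstream + 1                     # + one step byte (assumed)

def ssdpcm1_super_bytes_per_frame(sample_rate):
    bitstream = sample_rate // FPS // 8      # 1 bit per sample
    selectors = math.ceil(bitstream / 8)     # 1 bit per 8 bytes, rounded up (assumed)
    return bitstream + selectors + 2         # + two step bytes

for rate in (10400, 10800):
    print(rate,
          ssdpcm1_bytes_per_frame(rate) * FPS,
          ssdpcm1_super_bytes_per_frame(rate) * FPS)
```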

As the fidelity was vastly improved, I could get away with changing step sizes only once per frame (this would also make the decoder faster). In the case of the Axel F demo, I was updating at 10800hz, which equates to only 1650 bytes per second packed.

And now it's time for VF-SSDPCM1-Super Plus...

So what is this VF-SSDPCM1-Super Plus?
--------------------------------------
File sizes down to half those of SSDPCM1-Super, but at even higher quality. How does it work?

The encoder analyses the spectral content of each chunk and its neighbouring chunks and, using psychoacoustic masking, determines whether or not to halve or quarter the size of the chunk. This is also based on the amount of low/medium/high frequency strength within the chunk.

If there are only low frequencies present, let's assume <2000hz with an original sample rate of 8000hz, we can get away with halving the sample rate to 4000hz without affecting quality too much (and with less of an aliasing issue, due to non-existent or weaker signals >2000hz).

Hence per chunk, the samples get reduced in size and expanded by the decoder to reproduce the sample.
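
A minimal Python sketch of that reduce-and-expand idea follows. The pair averaging used for downsampling and the function names are assumptions; the expansion shown is plain sample doubling, the cheapest option for a decoder.

```python
def halve(chunk):
    """Crude 2:1 downsample by averaging adjacent pairs (filter choice is an assumption)."""
    return [(chunk[i] + chunk[i + 1]) // 2 for i in range(0, len(chunk) - 1, 2)]

def expand_by_doubling(chunk, factor=2):
    """Decoder side: repeat every sample `factor` times to restore the length."""
    return [s for s in chunk for _ in range(factor)]

frame = list(range(208))             # stand-in for one frame of samples
reduced = halve(frame)               # 104 samples to encode instead of 208
restored = expand_by_doubling(reduced)
print(len(reduced), len(restored))
```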

Now that is pretty straightforward. The key, however, is to reduce the pumping or fluttering that would occur when decoding these mixed chunks. This can be achieved with more post processing, but the aim is to have a low complexity decode using similar cpu time to the older method. (Some experiments were done with interpolation and tweening, which worked but used a considerable amount of cpu time.)

As mentioned previously, the encoder analyses the spectral content of the chunk, taking into consideration its neighbouring chunks. If the current chunk has more high frequency content than low, but the next chunk has more low frequency content than high, what would the decision be for the current chunk? We compare threshold values between the frequencies, looking ahead in time, and determine the course of action: either resample that high frequency chunk to low (which would give aliasing), or retain the high frequency, which would cause some transition issues when changing back to low afterwards.

All this data is packed using a much improved version of SSDPCM1-Super, which now operates on 4 unique step values per chunk rather than just 2 (which were negative and positive of each other).

For the above enhancement, it only needs two additional bytes per chunk but at the expense of extremely long encoding times due to brute forcing.

To lower the encoding time, the maximum step value boundaries to brute force are reduced or increased per chunk based on its frequency content. So let's now go on to the file sizes.

Let's assume a sample rate of 10400hz.

Full frequency data per chunk (a frame in this case) would be packed to 34 bytes a frame, expanding to 208 bytes when depacked.
Half frequency data per chunk would be packed to 19 bytes a frame, expanding to 104 bytes when depacked, then stretched to 208 bytes by either interpolation or byte doubling.
Quarter frequency data per chunk would be packed to 11 bytes a frame, expanding to 48 bytes when depacked, then stretched to 208 bytes (with or without interpolation).
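
One breakdown that reproduces these three figures is sketched below in Python. The split into bitstream bytes, selector bytes and four step bytes is inferred from the numbers and the description of the earlier format; it is not a statement of the actual file layout, and the function name is made up.

```python
import math

def packed_bytes(samples, step_bytes=4):
    bitstream = math.ceil(samples / 8)      # 1 bit per sample
    selectors = math.ceil(bitstream / 8)    # 1 selector bit per 8 bytes (assumed)
    return bitstream + selectors + step_bytes

for name, n in (("full", 208), ("half", 104), ("quarter", 48)):
    print(name, n, packed_bytes(n))
```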

The encoder uses tolerance values to determine the minimum averaged amplitude of the frequency bands where it can trigger very low/mid/high encoding of the frequencies, hence compression can be tweaked.
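
A very simplified Python sketch of such a tolerance test is shown below. It is not the actual encoder logic: the band edges, the naive DFT, the default tolerance and all names are assumptions, and the neighbouring-chunk lookahead and masking are left out entirely.

```python
import cmath
import math

def band_amplitudes(chunk, sample_rate, edges):
    """Average DFT magnitude in each band [edges[i], edges[i+1]) in Hz."""
    n = len(chunk)
    sums = [0.0] * (len(edges) - 1)
    counts = [0] * (len(edges) - 1)
    for k in range(1, n // 2):               # skip DC, stop below Nyquist
        freq = k * sample_rate / n
        mag = abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                      for i, x in enumerate(chunk))) / n
        for b in range(len(sums)):
            if edges[b] <= freq < edges[b + 1]:
                sums[b] += mag
                counts[b] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]

def choose_rate(chunk, sample_rate=10400, tolerance=0.25):
    """Pick the lowest rate whose discarded bands stay under the tolerance."""
    nyq = sample_rate / 2
    low, mid, high = band_amplitudes(chunk, sample_rate,
                                     [0, nyq / 4, nyq / 2, nyq])
    if high > tolerance:
        return "full"       # content above half-rate Nyquist: keep full rate
    if mid > tolerance:
        return "half"       # content above quarter-rate Nyquist: halve only
    return "quarter"

def tone(freq, n=208, sample_rate=10400, amplitude=50.0):
    """Test signal: a sine wave centred on 128."""
    return [128 + amplitude * math.sin(2 * math.pi * freq * i / sample_rate)
            for i in range(n)]

print(choose_rate(tone(200)), choose_rate(tone(2000)), choose_rate(tone(4000)))
```

Raising the tolerance pushes more chunks towards the half and quarter rates, which is the compression knob mentioned above.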

Overall, let's sum up some compression ratios of SSDPCM1, SSDPCM1-Super and VF-SSDPCM1-Super Plus (for a sample rate of 10400hz):

SSDPCM1 1350 bytes per second 7.7:1
SSDPCM1-Super 1600 bytes per second 6.5:1

Now with the VF-SSDPCM1-Super Plus method, the bytes per second vary based on the very low/mid/high frequency content and the tolerances set in the encoder. It can be as low as 550 bytes per second or as high as 1700 bytes per second. For most audio content, an in-between value of 1000-1100 bytes per second would be the average file size (around 10:1). Not bad for something that is higher in fidelity than SSDPCM1-Super.
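
To see how the mix of chunk types drives the data rate, here is a small Python sketch using the per-frame sizes given earlier. The particular mix of frames is hypothetical, and one byte per raw sample is assumed for the ratio.

```python
FRAME_BYTES = {"full": 34, "half": 19, "quarter": 11}   # per-frame sizes from above
RAW_BYTES_PER_SECOND = 10400                            # one byte per sample (assumed)

def bytes_per_second(frames):
    """frames: the 50 chunk types that make up one second of audio."""
    return sum(FRAME_BYTES[f] for f in frames)

# Hypothetical second of audio: 15 full, 25 half and 10 quarter rate frames.
mix = ["full"] * 15 + ["half"] * 25 + ["quarter"] * 10
rate = bytes_per_second(mix)
print(rate, round(RAW_BYTES_PER_SECOND / rate, 1))
```

All 50 frames at quarter rate give the 550 bytes lower bound, and all at full rate give the 1700 bytes upper bound.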

Let's get onto the details of the demo
--------------------------------------
Proof of concept. It's just some text with a Madonna picture and a real time visualiser (crude blocks) on top. It's there to demonstrate the VF-SSDPCM1-Super Plus method, nothing more or less.

The Madonna track used in the demo is the full audio from start to end with barely anything left out (only some subtle sections are missing, if you notice).

As it contains some repeating chorus and verse sections, I was able to reduce the amount of samples. However, this still equated to over two minutes of unique samples.

Each 4 bar pattern was then packed, giving an approximate compression of 9:1 or 10:1.

The previous version utilised a very small decoder which would push decoded bytes to the stack. However, I decided to opt for a page switching routine that does not use the stack and is unrolled (occupying approximately 8k of RAM). This also uses less CPU time than the stack push method (albeit using much more RAM).

The visualiser runs in realtime and is based on the actual output of the decoded samples at that moment.

I have used an updated scheduler in the stream chunk loading which simplifies the creation of segment order for the sample loading.

Previously I would have trigger points to load the relevant packed samples into the required slots and then play these back in a specific order, which did work well but was quite a pain to put together manually.

With this new scheduler, I simply pass the relevant 4 segment pattern data to it, and it then performs the loading (or copying from other slots, if the segment already exists elsewhere, to avoid reloading it). This is all done while the decoder is playing back previously loaded segments.
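
The load-or-copy decision could look roughly like the Python sketch below. It is purely illustrative: the slot model, the action tuples and the function name are assumptions, and it ignores timing and which slot is currently being played.

```python
def schedule(pattern, slots):
    """Plan how to fill the slots with the segments named in `pattern`.

    slots: segment ids currently held per slot (None = empty).
    Returns (actions, new_slots); actions are ("keep", seg, dst),
    ("copy", src_slot, dst) or ("load", seg, dst).
    """
    current = list(slots)
    actions = []
    for dst, seg in enumerate(pattern):
        if current[dst] == seg:
            actions.append(("keep", seg, dst))                  # already in place
        elif seg in current:
            actions.append(("copy", current.index(seg), dst))   # reuse, no disk access
        else:
            actions.append(("load", seg, dst))                  # fetch from disk
        current[dst] = seg
    return actions, current

actions, state = schedule(["A", "B", "A", "C"], [None] * 4)
print(actions)
```

In this example the repeated segment "A" is loaded once and copied for its second use.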

Can it be optimised and improved further?
-----------------------------------------
Indeed it can. The actual decoder I am using is a branchless method (which is more suited to interleaved decode). It was quick to put together and was able to stream in time, hence I used it. This is a testbed for the more optimised decoder and improvements to come when it's utilised as part of a real demo.

One thing to note, however, is that there is always a tradeoff between cpu usage and the possibility of streaming from floppy. To allow free streaming with no buffer overrun, the cpu usage in decode needs to be low AND the file sizes of the packed data need to be below a specific amount. If a full song with vocals is required, the final file size is particularly important, given that there is not much storage space available.

There are ways of getting around high cpu usage and lower disk load speed, but that involves prebuffering many segments and repeating loops (look at my SSDPCM 16khz demo as an example :-))

I was considering subsets of 2 or 1 bit streams with step values, but my aim was to have the filesize nearer to the 1k mark, and to achieve this via the waveshaping method I opted for the frequency adjustments instead of controlbit adjustments.

Sample quality issues
---------------------
Yes, due to the high compression, sample quality will suffer (but the goal was to improve on ssdpcm1-super with much higher compression, and that was achieved).

However, as a reminder: this demo is only for the 8580 (new SID). Dual SID is not supported (there are many issues with this due to detection, and even if correctly detected, filter caps may have more variance). Nonetheless, there is autodetect for the old SID too, but in most cases it will sound much worse. If you have a digiboost mod on your SID, there is not much point running the demo unless you want some severe distortion.

Digimax is supported: if you have one of these devices at $DE00, hold down space when the notice comes up at the bottom of the screen, until the screen turns black.

If using an emulator, use Resid and 8580 (if using Vice). Micro64 is fine using 8580. HoxS64 does not seem to work (due to incomplete drive emulation), and even if it does work in future, make sure to turn off digiboost.

Other Info
----------
Thanks to krill, who gets full credit for the loader code. If you experience any issues (unless it is with a 1541u1 or sd2iec), please get in touch. If copying it to run on a real floppy, please make sure that you copy the full 40 tracks and not the default.

added on the 2018-04-06 21:02:53 by algorithm

Cool! Excellent sample quality and proper tune. :)

rulez added on the 2018-04-06 22:21:15 by StingRay