Asking the Wrong Question?

Maybe this EU Referendum is asking the wrong question.

One week to go and I still don’t know which vote will be best, though I am probably more of an “innie” than an “outie”.

There do seem to be some common concerns on both sides about not getting done in by faceless bureaucrats or greedy business leaders.

So I am realising that perhaps we need answers to some other questions:

  • Why are there no politicians whom I can trust further than I can throw them?
  • Why should we have to fight SO hard to protect our society against business interests?

And maybe:

  • Are we ever going to stop voting with our wallets and realise it’s not all about economics?

Hmmm. Good Luck – we might need it.

ACCU2016: Talk on Software Architecture Design 4: A Design Example

[The following transcript is more for the techies of my readership. For those of a less technical inclination, feel free to wait for the next post on “Active Design Ideas” which I have separated out due to the length of this post.]

I want to underpin the philosophical aspect of this discussion by using an example software architecture and considering some design problems that I have experienced with multi-threaded video player pipelines. The issues I highlight could apply to many video player designs.

The following image is a highly simplified top-level schematic, the original being just an A4 pdf captured from a whiteboard, a tool that I find much better for working on designs than using any computer-based UML drawing tool. The gross motor movement of hand drawing ‘in the large’ seems to help the thinking process.

Player

There are three basic commands for controlling any video player that has random access along a video timeline:

  • Show a frame
  • Play
  • Stop

In this example there is a main controller thread that handles the commands and controls the whole pipeline. I am going to conveniently ignore the hard problem of actually reading anything off a disk fast enough to keep a high resolution high frame-rate player fed with data!

The first operation for the pipeline to do is to render the display frames in a parallel manner. The results of these parallel operations, since they will likely be produced out of order, need to be made into an ordered image stream that can then be buffered ahead to cope with any operating system latencies. The buffered images are then transferred into an output video card, which has only a relatively small amount of video frame storage. This of course needs to be modeled in the software so that (a) you know when the card is full; and (b) you know when to switch the right frame to the output without producing nasty image tearing artefacts.
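The step of turning out-of-order render results into an ordered image stream can be sketched as a small reorder buffer. This is only an illustration in Python, not the code of the player described here; the class name, the method name and the use of plain integer frame numbers are all assumptions, and a real version would also need locking between the render threads and the consumer.

```python
class ReorderBuffer:
    """Hypothetical helper: accepts rendered frames in any order and
    releases them strictly in frame-number order."""

    def __init__(self, first_frame=0):
        self._next = first_frame   # next frame number owed to the output
        self._pending = {}         # frame_number -> image, held until in order

    def push(self, frame_number, image):
        """Store one rendered result and return the frames that are now in order."""
        self._pending[frame_number] = image
        ready = []
        # Release the longest run of consecutive frames starting at _next.
        while self._next in self._pending:
            ready.append((self._next, self._pending.pop(self._next)))
            self._next += 1
        return ready
```

Each call to `push` returns an empty list while an earlier frame is still missing, and a run of consecutive frames as soon as the gap is filled. Those frames can then go into the look-ahead buffer.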

These are all standard elements you will get with many video player designs, but I want to highlight three design issues that I experienced in order to get an understanding of what I will later term an “Organising Principle”.

First there was slow operation resulting in non real-time playout. Second, occasionally you would get hanging playout or stuttering frames. Third, you could very occasionally get frame jitter on stopping.

Slow operation
Given what I said about Goethe and his concept of Delicate Empiricism, the very first thing to do was to reproduce the problem and collect data, i.e. measure the phenomenon WITHOUT jumping to conclusions. In this case it required the development of logging instrumentation software within the system – implemented in a way that did not disturb the real-time operation.
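One common way to keep such instrumentation from disturbing the timing is to record events into a preallocated in-memory ring buffer and only write them out after the run. The sketch below assumes that approach; it is not the instrumentation used in the original system, every name in it is hypothetical, and it is written for a single recording thread (for example one log per thread).

```python
import time


class EventLog:
    """Hypothetical fixed-size event recorder: no I/O and no growth while recording."""

    def __init__(self, capacity):
        self._events = [None] * capacity   # preallocated storage
        self._count = 0                    # total events ever recorded

    def record(self, tag, frame_number):
        """Overwrite the oldest slot with a timestamped event."""
        slot = self._count % len(self._events)
        self._events[slot] = (time.perf_counter(), tag, frame_number)
        self._count += 1

    def dump(self):
        """Return the surviving events, oldest first (call after the run)."""
        n = len(self._events)
        if self._count <= n:
            return self._events[:self._count]
        start = self._count % n
        return self._events[start:] + self._events[:start]
```

The trade-off is that only the most recent `capacity` events survive, which is usually acceptable when the aim is to look at what happened just before a fault.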

With this problem I initially found that the image processing threads were taking too long, though the processes were doing their job in time once they had their data. So the slowdown was happening BEFORE they could start their processing.

The processing relied on fairly large processing control structures that were built from some controlling metadata. This build process could take some time so these structures were cached with their access keyed by that metadata, which was a much smaller structure. Accessing this cache would occasionally take a long time and would give slow operation, seemingly of the image processing threads. This cache had only one mutex in its original design and this mutex was locked both for accessing the cache key and for building the data structure item. Thus when thread A was reading the cache to get at an already built data item, it would occasionally block behind thread B which was building a new data item. The single mutex was getting locked for too long while thread B built the new item and put it into the cache.

So now I knew exactly where the problem was. Notice the difference between the original assumption, that the problem lay with the image processing, and the actual cause, which was the cache access.

It would have been all too easy to jump to an erroneous conclusion, especially prevalent in the Journeyman phase, and change what was thought to be the problem. Although such a change would not actually fix the real issue, it could have changed the behaviour and timing so that the problem may not present itself, thus looking like it was fixed. It would likely resurface months later – a costly and damaging process for any business.

In this case the solution was to have finer-grained mutexes: one for the key access into the cache and a separate one for accessing the data item, which was then lazily built on first access.
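A minimal sketch of that locking scheme, in Python rather than the language of the original player, might look like the following. The class names and the `build` callback are hypothetical; the point is only that the map lock is held briefly, while the slow build holds a lock that belongs to one entry alone.

```python
import threading


class _Entry:
    """One cache slot with its own lock, so a slow build blocks only this key."""

    def __init__(self):
        self.lock = threading.Lock()
        self.value = None
        self.built = False


class ControlStructureCache:
    def __init__(self, build):
        self._build = build                 # hypothetical slow builder: key -> structure
        self._map_lock = threading.Lock()   # guards the key -> entry map only
        self._entries = {}

    def get(self, key):
        # Short critical section: find or create the entry for this key.
        with self._map_lock:
            entry = self._entries.get(key)
            if entry is None:
                entry = self._entries[key] = _Entry()
        # Potentially long critical section, private to this entry: lazy build.
        with entry.lock:
            if not entry.built:
                entry.value = self._build(key)
                entry.built = True
            return entry.value
```

With this arrangement thread A, reading an already built item, waits only for the brief map lookup and never for thread B's build of a different item.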

Hanging Playout or Stuttering Frames
The second bug was that the playout would either hang or stutter. This is a great example because it illustrates a principle that we need to learn when dealing with any streamed playout system.

The measurement technique in this case was extremely ‘old school’, simply printing data to a log output file. Of course only a few characters were output per frame, because at 60fps (a typical modern frame-rate) you only have 16ms per frame.

In this case the streaming at the output end of the pipeline was happening out of order, a bad fault for a video playout design. Depending upon how the implementation was done, it would either cause the whole player to hang or produce a stuttered playout. Finding the cause of this took a lot of analysis of the output logs and many changes to what was being logged. This is an example of needing to be clear about the limits of one’s knowledge and of properly identifying the data that next needs to be collected.

I found that there was an extra ‘hidden’ thread added within the output card handling layer in order to pass off some other output processing that was required. However it turned out that there was no enforcement of frame streaming order. This meant that the (relatively) small amount of memory in the output card would get fully allocated and this would give rise to a gap in the output frame ordering. The output control stage was unable to fill the gap in the frame sequence with the correct frame, because there was no room in the output card for that frame. This would usually result in the playout hanging.

MindTheGapCropped

This is why, with a streaming pipeline, where you always have limited resources at some level, allocation of those resources MUST be done in streaming order. This is a dynamic principle that can take a lot of hard won experience to learn.
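To make the principle concrete, here is a small single-threaded model of a card with a fixed number of frame slots that are handed out only in frame order. It is a sketch under assumptions: the class and method names are invented for illustration, and a real implementation would need a lock or condition variable around the state.

```python
class OutputCardSlots:
    """Hypothetical model of an output card with a fixed number of frame slots,
    allocated strictly in streaming (frame number) order."""

    def __init__(self, slot_count, first_frame=0):
        self._free = slot_count
        self._next_to_allocate = first_frame

    def try_allocate(self, frame_number):
        """Grant a slot only to the next frame in the stream."""
        if frame_number != self._next_to_allocate or self._free == 0:
            return False          # caller must retry later
        self._free -= 1
        self._next_to_allocate += 1
        return True

    def release(self):
        """Call when a frame has been displayed and its slot can be reused."""
        self._free += 1
```

With two slots, requests for frames 1 and 2 are refused until frame 0 has its slot, so later frames can never use up the memory that an earlier frame still needs.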

The usual Journeyman approach to such a problem is just to add more memory, i.e. more resource! This will hide the problem because though processing will still be done out of order, the spare capacity has been increased and it will not go wrong until you next modify the system to use more resource. At this point the following statement is usually made:

“But this has been working ok for years!”

The instructions I give to less experienced programmers when they are trying to debug such problems will usually include the following:

“Do not change any of the existing functionality.
Disturb the system as little as possible.
Keep the bug reproducible so you can measure what is happening.
Then you will truly know when you have fixed the fault.”

Frame Jitter on Stop
The third fault case was an issue of frame jitter when stopping playout. The problem was that although the various buffers would get cleared, there could still be some frames ‘in flight’ in the handover threads. This is a classic multi-threading problem and one that needs careful thought.

In this case when it came time to show the frame at the current position, an existing playout had to be stopped and the correct frame would need to be processed for output. This correct frame for the current position would make its way through to the end of the pipeline, but could get queued behind a remnant frame from the original stopped playout. This remnant frame would most likely have been ahead of the stop position because of the pre-buffering that needed to take place. Then when it came time to re-enable the output frame viewing in order to show the correct frame, both frames would get displayed, with the playout remnant one being shown first. This manifested on the output as a frame jitter.

One likely fix from an inexperienced programmer would be to make the system sit around waiting for seconds while the buffers were cleared and possibly cleared again, just in case! (The truly awful “sleep” fix.) This is one of those cases where, again due to lack of deep analysis, a defensive programming strategy is used to try and force a fix of what initially seems to be the problem. Again, it may well SEEM to fix the problem, and it is especially likely to be reached for when the developer is under heavy time pressure.

The final solution to this particular problem was to use the concept of uniquely identified commands, i.e. ‘command ids’. Thus each command from the controlling thread, whether it was a play request or a show frame request, would get a unique id. This id was then tagged on to each frame as it was passed through the pipeline. By using a low-level globally accessible (within the player) ‘valid command id set’ the various parts of the pipeline could decide, by looking at the tagged command id, if they had a valid frame that could be allowed through or quietly ignored.

When stopping the playout all that had to be done was to clear the buffers, remove the relevant id from the ‘valid command id set’ and this would allow any pesky remaining ‘in flight’ frames to be ignored since they had an invalid command id. This changed the stop behaviour from being an occasional, yet persistent bug, into a completely reliable operation and without the need for ‘sleep’ calls anywhere.
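The command id idea can be sketched as follows. This is an illustration in Python with hypothetical names, not the original implementation; frames are modelled as `(command_id, frame_number)` tuples and the output stage is reduced to a filter.

```python
import itertools
import threading


class CommandIds:
    """Hypothetical 'valid command id set' shared by the pipeline stages."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counter = itertools.count(1)
        self._valid = set()

    def begin(self):
        """Issue a unique id for a new play or show-frame command."""
        with self._lock:
            cmd_id = next(self._counter)
            self._valid.add(cmd_id)
            return cmd_id

    def cancel(self, cmd_id):
        """Invalidate a command, e.g. when playout is stopped."""
        with self._lock:
            self._valid.discard(cmd_id)

    def is_valid(self, cmd_id):
        with self._lock:
            return cmd_id in self._valid


def output_stage(frames, ids):
    """Let through only frames tagged with a still-valid command id."""
    return [frame for frame in frames if ids.is_valid(frame[0])]
```

For example, if a play command is cancelled while frames 107 and 108 are still in flight and a show-frame command for frame 100 follows, the output stage drops the two remnants and passes only frame 100, with no waiting involved.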

In the next post I will recap the above process of finding and fixing the problems from a human development perspective.


Minimalist Mastery

YachtClubSmall.jpeg

The Yacht Club

Once again I have succumbed to buying another watercolour (above) by Jim Spencer,
while my own painting efforts are still “in progress”.

I just can’t help it with this artist.

He is a master of minimal technique, producing paintings that grow on me over time.

Paintings I can breathe into.

RedSkyAtNightSmall

Red Sky at Night

This mastery of minimalism to maximum effect is something close to my heart as you might know if you have read any of my thoughts about software development.
It is always something I aspire to, to the extent that I even use “Red Sky At Night”
in my talks to demonstrate the idea.

All I have to do now is get to that place with my painting!

Also, for those who are still waiting for my ACCU talk transcripts:
I am still working on the next one.
It’s amazing how long it takes to transcribe a 75 minute talk!

STUDY DIARIES: Truth & Knowledge Commentary

I have just come across a truly masterful treatise that gives a very cogent commentary of Steiner’s epistemological dissertation Truth & Knowledge, as well as some pointers to The Philosophy of Freedom.

Having concluded that it would be impossible to précis my own study work of the text I have absolutely no hesitation in recommending this paper. It is written by Ron Brady and was near publication when he died in 2003. The folks at the Nature Institute have published it in their Ron Brady Archive.

It takes the reader step by step through one of Steiner’s foundational texts and is written much more for the modern reader, so it is more approachable than Steiner’s original text, though you still need to keep your wits about you!

So many thanks to Ron and to the folks at the Nature Institute.

I am playing with the idea of perhaps itemising the main points in a future blog post.

An Artistic Day Out

Today was one of those wonderful, surprising and unplanned days. I decided to take a look at the art in the Open Studios 2016 event going on around Newbury.

First stop was to see a studio of an artist who looks like she is going to be one of my favourites. So much so that I am posting a couple of her pictures here from her SAA website with her permission.

Her name is Pearl Hailstone and it was lovely to turn up at her studio and get offered some tea and biccies(!) while we chatted and she gave me some great advice for my budding watercolour skills. I just love the colour contrast and loose style of her painting so have taken on her advice and will have to give it a go.

Thank you Pearl for such a warm welcome and a cuppa just when I needed it.

PearlHailstone_DartmoorTreesAndFence

Dartmoor Trees and Fence

PearlHailstone_TheOldLock

The Old Lock

Another artist of note was Sarah Moorcroft whose work I saw at the Insight exhibition at New Greenham Arts. It literally jumped off the wall and hit me in the eye! She has some stunning high colour contrast ink on paper work which is well worth a look down at Pineapple Palace.

It was great to see an art piece and think, “Hmmm, how was that done?”, and then be able to actually go for a short drive to talk with the artist and ask them.

One of those great days when I felt I was in the best of the universe’s flow.

POEM: Too Easy

Too easy to expect respect
When you can’t respect yourself

Too easy to ask for love
When you can’t love yourself

Too easy to blame others
When you can’t forgive yourself

Too easy to treat others
As you would like to be treated yourself

But such is the counsel of fools
We are different
You might hate what I like
I might hate what you like

We are imperfect, flawed
Yet beautiful nonetheless
Mistake will follow on from mistake
Despite our best intentions – our best guess

We might not love ourselves enough
Yet we can talk – converse
And regardless of the pain that comes
Touch that light – that essence between us

We might not learn self-respect
But with eggshells strew the path
Whereon others cut their souls
And leave us facing a lonely hearth

We might not feel we can change
Nor face the hurt – the fear
But if we can learn its shape
We might find it easier to bear

But don’t expect others
To give you
What you cannot
Give yourself.

Bench

© Charles Tolman May 2016