Sunday, March 19, 2017

Experimenting with Pi3 optimisations - POT 5.6.0-beta1 with Qt 5.8.0 built for armv8 with GCC 4.9.4

At the moment I still don't want to test custom firmwares or 64-bit builds, but I have started testing a couple of new things: a new compiler from Linaro, version 4.9.4 instead of 4.8 (the one provided by the foundation keeps giving me headaches), and optimised compiler flags for the Raspberry Pi 3, which is an armv8.
In this build, Qt, ffmpeg and POT are all built with the Linaro 4.9.4 toolchain and compiler flags optimised for the Pi3. This build will only work on a Pi3.
You probably won't see much difference in GPU-intensive apps, but it is a step on the road to optimisation!

Have fun! Bye! ;-)

Download the toolchain here.
Download POT 5.6.0-beta1 for Raspbian Jessie Lite Pi3 here (md5: 0eec41ef02e9369fc7e569030b8ff868).

Saturday, January 28, 2017

Hardware Decoding in Chromium through QtWebEngine on Raspberry Pi

A few months back I decided to try to implement hardware decoding in WebKit. Unfortunately this task is always pretty long and complex for many reasons. I found the time to draft an implementation for WebKit1, which is pretty useless as WebKit1 in Qt is only used outside QML and JS is executed in the main thread. Unfortunately I never found the time to implement this in WebKit2, which runs in QML and is suitable for more fluid UIs. This was the result:
This is how YouTube was running with this implementation:

Now Qt has deprecated QtWebKit and is working heavily on QtWebEngine, which is built on Chromium, so I wanted to try this road. Unfortunately this kind of thing always takes a lot of time, and I typically don't have that much, but I was able to make a start and already have something working.

Writing a complete solution in Chromium to decode and render video takes a lot of time, so I thought of a shortcut: creating a custom VDA (Video Decode Accelerator) that loads the POT library and reuses its entire codebase to implement decoding and rendering with few modifications. This proved to be possible, and now I get something on the screen.

So, to summarise: a small patch to Chromium is needed to create a VDA that dynamically loads the POT library into memory and uses it through a common interface. Data and calls are translated to POT structures and sent to POT, which then processes the buffers. The result of the decode operation is sent back to the VDA through the same interface, and the textures are then passed to Chromium for rendering.

Many problems remain open and there is still much to be done, as you can see from the video, but something is drawn. Have a look at the demo:

Have fun! ;-)