Saturday, February 16. 2013
Testing WebRTC
Over the last months I have been following a new project closely related to HTML5: WebRTC (Web Real-Time Communication). WebRTC is an open project with the aim of turning the web browser into a Real-Time Communications (RTC) device via JavaScript APIs. I have already written in this blog about VoIP and the Web in a two-entry series that was interesting (at least for me). WebRTC is being standardized by the W3C (World Wide Web Consortium). It is backed mainly by Google, although Mozilla and Opera quickly joined, developing their own implementations. Microsoft was reluctant at first but has finally embraced the idea although, as usual, in its own way. Microsoft's proposal is CU-RTC-Web, which is broadly WebRTC but with different codecs and other minor changes (do you see any similarity to the HTML5 video tag problems?). On this subject Apple, with Safari, is currently missing in action, but it is easy to guess what its position will be.
When I first heard about WebRTC there was a good introduction on YouTube and, although things are changing very quickly, I still recommend it to get a brief understanding of the idea. Things are now getting more and more specific but, if you check the main WebRTC page, all the JS functions are still prefixed in all browser implementations (webkit*, moz* and so on) because there is no definitive standard API yet. When I started to look at the project I did not understand the whole interaction among all the different elements; my doubts were mainly about the role of the web server.
Last week I found out that the Firefox and Chrome developers are working together to achieve real interoperability; there is even a demo application to test both implementations. I have been studying how the application works and what the real roles of the browsers and the server are in a WebRTC implementation.
First I am going to describe the two parts that the WebRTC API is divided into:
The JavaScript getUserMedia API, which allows the browser to access the camera and microphone hardware. The returned stream can simply be displayed locally or sent over the network.
The PeerConnection API, which allows browsers to establish direct peer-to-peer (P2P) connections between them. The connection joins one browser directly to another and data is exchanged directly. P2P is very useful for high-bandwidth content like video and audio, because the server is freed from dealing with large amounts of data. Obviously, concepts like STUN servers or media proxies have to be handled by the underlying implementation of the API.
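Since, as I said, all these functions are still prefixed in every browser, even calling getUserMedia requires picking the right implementation first. A minimal sketch of the first part of the API, assuming a page with a hypothetical `<video id="local">` element (the element id and helper name are mine, not from the demo):

```javascript
// Pick whichever (possibly prefixed) getUserMedia the browser exposes.
function getUserMediaFn(nav) {
  return nav.getUserMedia || nav.webkitGetUserMedia || nav.mozGetUserMedia || null;
}

var constraints = { audio: true, video: true };

if (typeof navigator !== 'undefined') {      // only runs inside a browser
  var gum = getUserMediaFn(navigator);
  if (gum) {
    gum.call(navigator, constraints,
      function (stream) {
        // display the local stream in a <video id="local"> element
        var video = document.getElementById('local');
        video.src = window.URL.createObjectURL(stream); // Chrome-style attach
        video.play();
      },
      function (err) { console.log('getUserMedia error: ' + err); });
  }
}
```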
Both parts are needed to get WebRTC in your browser, but they can be considered different parts of the same API. After studying the demo application, this is how it works:
As soon as a user enters the application, a room is assigned (a numeric identifier which is managed as a URL parameter). At this point the browser gets access to the local resources (camera and mic), an operation that asks for permission (the getUserMedia part). It also creates a peer connection offer which is sent to the server (the PeerConnection part). This offer, I suppose, contains things like IPs, codecs and everything else needed to establish a P2P connection.
The partner browser enters the application specifying the same room (same parameter r=<ROOM_ID>) and performs the same operations (it accesses the local resources and sends another offer to the server). But this time the server finds that there is already another user in the same room, and both offers are exchanged.
At this point there is a PeerConnection negotiation to decide how the communication between the two browsers is going to work, with the server acting as a man in the middle that lets both partners interact (both partners know the server and can talk through it). Each offer is exchanged between the browsers and all the decisions needed to establish the P2P connection are agreed upon (I do not really know what is decided here). The application uses the Google Channel API for the communication between the browsers and the server.
After the negotiation, both browsers interact directly with each other and the video call (audio and video) takes place without any server involvement.
Finally, when a user leaves the page or hangs up the call, the browser contacts the server again to inform it that the user has left the room (the server then deletes the user from the room).
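The room bookkeeping described in these steps can be sketched in a few lines (this is my own hypothetical reconstruction in JavaScript, not the real apprtc server code, which is written in Python):

```javascript
// Hypothetical room registry: the first offer waits, the second one triggers
// the exchange, and leaving removes the user from the room.
function createRoomRegistry() {
  var rooms = {};                       // roomId -> list of {user, offer}
  return {
    join: function (roomId, user, offer) {
      var room = rooms[roomId] || (rooms[roomId] = []);
      room.push({ user: user, offer: offer });
      if (room.length === 2) {
        // both offers are known: hand each peer the other's offer
        return { exchange: [room[1].offer, room[0].offer] };
      }
      return { exchange: null };        // first peer just waits
    },
    leave: function (roomId, user) {
      var room = rooms[roomId] || [];
      rooms[roomId] = room.filter(function (m) { return m.user !== user; });
      if (rooms[roomId].length === 0) delete rooms[roomId];
    },
    size: function (roomId) { return (rooms[roomId] || []).length; }
  };
}
```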
With this application I have finally understood the role of the server in WebRTC: it is the element that connects two particular clients/browsers (so concepts like presence, which I discussed in the previous VoIP entries, are now the responsibility of the server application). Now I am going to describe the steps I performed to test the demo:
The application was checked out:
$ svn checkout http://webrtc-samples.googlecode.com/svn/trunk/apprtc/
The demo runs on Google App Engine (Python version), so I downloaded the SDK for Linux and unzipped it:
$ unzip -q ~/Desktop/google_appengine_1.7.4.zip
The application was started, but listening on all the interfaces of my box:
$ google_appengine/dev_appserver.py --address=0.0.0.0 apprtc/
To test the application, Firefox nightly (FF21) and Chrome beta (Chrome 25) are the recommended versions. In theory Chrome 23 (version 21 added the getUserMedia API and version 23 the PeerConnection one) and Firefox 18 (the version that added all the WebRTC code to the Mozilla browser) already have the full WebRTC API but, clearly, both implementations are very immature and keep evolving version by version (bear in mind that the standard itself is only a draft right now). The application has some tweaks to make the conversation possible.
So Chrome 25 (currently in the beta channel) and Firefox 21 (currently a nightly) were installed on my old laptop (the new one does not have a camera) and on my desktop machine.
$ /opt/google/chrome/chrome --version
Google Chrome 25.0.1364.68
$ ./firefox --version
Mozilla Firefox 21.0a1
- Launch Chrome 25 on one box and access the application (http://192.168.1.131:8080 in my case). As soon as you enter, the browser is redirected with a room identifier parameter (http://192.168.1.131:8080?r=<ROOM_ID>). The browser asks for permission to access the camera and microphone and waits for the other partner to join.
- Launch Firefox 21 on the other box and access the application using the same room id (enter the URL with the parameter directly, http://192.168.1.131:8080?r=<ROOM_ID>). Firefox asks for permissions too and, as soon as it enters, the communication starts.
Firefox needs two options enabled in the about:config page: media.navigator.enabled and media.peerconnection.enabled. In the downloaded version 21, only the second one was not set by default.
Here is the video:
Today's entry is a little explanation of the new WebRTC project. As I said before, I did not understand it completely at the beginning, and this entry dissects the new demo application just to review its basics. I think it is clearer for me now, and I hope I have helped you understand it too. Obviously WebRTC is emerging as a new HTML5 feature and (as with many of the previous features) a dispute among the major players is assured.
Web Wars forever!
Thursday, September 8. 2011
Testing WebGL
One of the recurring topics of this blog is the HTML5 standard and its many new features. Today I am going to test WebGL, a standard specification to add 3D graphics to a browser canvas element without using any plug-in or external component (so it is not part of HTML5 itself, but it depends on it). Obviously this feature is deeply tied to graphics hardware acceleration, because any WebGL application needs to be GPU accelerated to run smoothly. WebGL is a brand-new feature in the web world and another element of great controversy among browsers. If you remember, the canvas element was already covered in a previous series of this blog, but only 2D graphics were shown there.
The WebGL standard is driven by the Khronos Group and all the major web players are involved (Apple, Google, Mozilla and Opera) except Microsoft. The corporation headquartered in Redmond rejects the standard and has declared several times that IE will not support it; they consider WebGL harmful in terms of security, and they are not totally wrong.
The summary of the current situation is more or less the following. Chrome (initial WebGL support in version 9) and Firefox (version 4 also added WebGL) support WebGL on Windows (and I think on Mac OS) and it is enabled by default. Opera does not support it, although there is a WebGL preview build of version 11.50, only for Windows. On Mac OS, Safari was adding support in the WebKit nightly builds, but maybe WebGL is already supported in Lion (sorry, I am not a Mac OS user). There are some summaries of the support matrix, for example on the Khronos page or in the first chapter of this tutorial, but this is a constantly changing subject, so please always re-check. Nevertheless, the main problems arise in the Linux world. There are a lot of issues related to the implementation of many graphics drivers: Firefox reported issues with Linux GPU drivers, and both Chrome and Firefox added a GPU blacklist to disable some features (mainly WebGL) depending on the driver and the OS involved. Firefox 6 and 7 are supposed to remove more and more drivers from the blacklist, and the same behavior is expected in the next versions of Chrome, as Linux driver implementations become more reliable. In summary, current browsers on Linux hardly support any driver out of the box for WebGL besides the NVIDIA binary blob (at least in my case both my boxes are blacklisted: my desktop uses the Gallium R300 driver and my laptop the classic Intel stack).
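Given how much the support matrix changes, a quick way to check whether your own browser/driver combination actually gives you a WebGL context is to just ask for one. A small sketch (the helper name is mine; the list of context names reflects the prefixed names browsers answer to at this stage):

```javascript
// Try the known WebGL context names until one works; return null if none do.
function getWebGLContext(canvas) {
  var names = ['webgl', 'experimental-webgl', 'moz-webgl', 'webkit-3d'];
  for (var i = 0; i < names.length; i++) {
    try {
      var gl = canvas.getContext(names[i]);
      if (gl) return gl;
    } catch (e) { /* some browsers throw instead of returning null */ }
  }
  return null;
}

if (typeof document !== 'undefined') {       // only runs inside a browser
  var ok = getWebGLContext(document.createElement('canvas')) !== null;
  console.log('WebGL available: ' + ok);
}
```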
After this little introduction I am going to present a quick WebGL demo. I am very, very interested in porting my PFC project to a web version. It is incredible to me that a 12-year-old project (which had to run inside an SGI server) can now run inside a browser. If you remember what I told you in that entry, I lost my code and all the data (mesh and textures) because of a broken CD, so this implementation is only a simple demo with new, fixed data (there is a lot of room for improvement). The demo has the following features:
- A fixed height mesh is retrieved using a static JSON file. This file contains the starting latitude and longitude of the mesh, the number of points (rows and columns), the length or step of each side, the matrix data and a lot of configuration parameters. This mesh is not drawn completely. In OpenGL and WebGL (they are very similar) the idea is simple: all the figures are sent to be drawn and, depending on the perspective, the camera and other elements, the engine draws what is needed. But in my PFC the mesh was so big that only a part of it was sent to be drawn. I have followed the same idea in this demo.
- Basic camera management. The camera is implemented using its own axes; this way the camera can turn and move following natural keystrokes and mouse movements.
- Tile representation and painting. In the real application, every satellite image was the texture for a tile of the mesh (50x50 points). This way, the part of the mesh corresponding to an image was painted independently (the application drew each square of the mesh with the associated texture). I have implemented the same here but, in the demo, there is only one tile (an 8x8 tile) which is repeated horizontally and vertically all the time (11 times in each direction).
- Easy texture loading. A simple FIFO queue has been implemented to load and unload images. Actually, this is not exercised in the demo because only one image is used (there is only one tile which is repeated, so only one texture is needed and the queue is never filled).
- Basic ambient and directional light emulating the sun (in my PFC there were more environmental effects like sky, fog, ...).
- Management of the configuration parameters. I have used the new input type number, which is only supported in some browsers. HTML5 is everywhere.
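The texture FIFO mentioned in the list can be sketched like this (the names and the capacity parameter are my own; in the real demo the entries would hold actual WebGL texture objects):

```javascript
// Hypothetical texture FIFO: the oldest texture is evicted (and unloaded)
// once the capacity is reached.
function createTextureQueue(capacity, unload) {
  var queue = [];                       // oldest entry first
  return {
    load: function (url, texture) {
      queue.push({ url: url, texture: texture });
      while (queue.length > capacity) {
        var oldest = queue.shift();     // FIFO eviction
        unload(oldest.texture);
      }
    },
    length: function () { return queue.length; }
  };
}
```

In the demo, as I said, only one texture is ever loaded, so the eviction path never actually runs.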
The whole demo is below, integrated in the entry using an iframe. It is a pity that I do not have the real data from my PFC at my disposal (damn CD). As I said, a simple pattern (only 8x8 points) is repeated all the time (11x11 tiles); this way I do not have to upload a lot of images to the server. How to navigate using the keyboard or the mouse is explained on the right side of the page.
Remember what I said before: to test this example you need a modern browser if you use Windows (Firefox, Chrome or the Opera 11 preview, but not IE9). On Linux you need Firefox or Chromium, but usually you need to disable the blacklist. Debian testing packages have just been upgraded to Chromium 13 and Iceweasel 5 respectively, and both support WebGL. In Chromium the blacklist has to be ignored by starting the browser with the following option:
$ chromium --ignore-gpu-blacklist
In Firefox I have changed some webgl settings (in the about:config page):
webgl.disabled         | false (default)
webgl.force-enabled    | true
webgl.force_osmesa     | false (default)
webgl.osmesalib        | /usr/lib/x86_64-linux-gnu/libOSMesa.so.6
webgl.prefer-native-gl | true
webgl.shader_validator | true (default)
webgl.verbose          | true
And the browser has to be executed with the following environment variable:
$ MOZ_GLX_IGNORE_BLACKLIST=1 iceweasel
This entry is, without a doubt, the one I have spent the most time working on. I was developing the JavaScript code for long hours, and I have to admit that I started it some time ago (in the last months I have been quite busy and I was never in the mood to finish it). Documentation about WebGL is very poor; thanks to this good tutorial I could start the development, but when you need something special, custom or a mix of things, you usually waste a lot of time just trying to figure out how to do it. In summary, current WebGL support is quite a mess (much more so in the Linux world), but it is an essential feature to move gaming to the web. When I saw that Quake was available inside a browser, I was totally shocked. I personally think this technology is unstoppable, no matter how buggy or insecure it may be. Needless to say, Flash is the current option for all of this, so WebGL would have to be very buggy and very insecure indeed to be unworthy.
Long life to the web wars!
Tuesday, August 23. 2011
Videos Are About to Change (Second Time Lucky)
A long time ago, the way videos are presented in this blog was about to change. I was going to replace my useful Flash Flowplayer with the new HTML5 video tag. I changed my mind because video was not supported in any stable IE and there was some warfare about the supported codecs among the rest of the browsers.
Around the time I published that entry, Google made the WebM video+audio format public; they created the WebM project, and since then the new codec has been quickly adopted by browsers. Only IE and Safari do not support WebM by default, but this time Google solved the issue by announcing plugins for both OS-integrated browsers (they are at a very early stage). Microsoft released IE9 to the public in March of this year, with a lot of the new HTML5 features (including video and audio support). The Iceweasel testing package was upgraded to 5.0 last week in Debian (the browser was still at version 3.5 before). Therefore all my requisites to switch to the HTML5 solution are fulfilled and there is no reason not to change my videos.
From now on the videos are going to be presented using the video tag and only in WebM format. Just like this:
<video src="out.webm" autobuffer controls>
  <img src="error.png" alt="Error!" />
  Your browser does not support the video tag. See this
  <a href="/rickyepoderi/index.php?/archives/41-...">entry</a>
  for more information about how to see the videos in this blog.
  You can <a href="out.webm">download the file</a> instead.
</video>
And here is my latest video in the new format, just as an example (the video used in the Backing Up Data Into an Encrypted External Disk entry):
So maybe you have reached this post because you cannot see my videos. At the moment, the browser status regarding video and WebM is the following:
- Firefox: supports both the video tag and WebM since version 4.0.
- Chrome: Both are supported since 6.0.
- Opera: Video supported since 10.50, WEBM since 10.60.
- IE: IE9 supports the video tag, but WebM is not a valid format out of the box. You need to install the plugin offered by the WebM project. Remember that IE9 cannot be installed on Windows XP.
- Safari: video is supported since 3.1, but again WebM needs an external codec. Here there are two options: the plugin from Google, and Perian, an open source QuickTime component that adds native support for many popular video formats (it supports WebM since 1.2.3). Some friends who tested both solutions recommend the Perian option (it seems Perian is very common in the Mac world and the Google plugin causes some problems right now; sorry, but I am not a Mac guy).
- Others: if you are using any other browser you are a geek, so I hope you know what you are doing and how to watch an HTML5 video in WebM format.
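You can also let the browser answer the question itself, using the canPlayType method of the video element (the helper name is mine; canPlayType is the standard HTML5 way to probe codec support):

```javascript
// Ask a video element whether it thinks it can play WebM (VP8 + Vorbis).
// canPlayType returns "", "maybe" or "probably".
function canPlayWebm(videoEl) {
  if (!videoEl.canPlayType) return false;            // no HTML5 video at all
  return videoEl.canPlayType('video/webm; codecs="vp8, vorbis"') !== '';
}

if (typeof document !== 'undefined') {               // only runs inside a browser
  console.log('WebM playable: ' + canPlayWebm(document.createElement('video')));
}
```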
In summary, in most cases if you cannot see the videos you only need to upgrade your browser to the latest version. In addition, if you are using the Windows-integrated IE9 or the Mac OS-integrated Safari, you need to manually install a WebM codec plugin (see the previous links). Finally, if you are using IE8 (or earlier) on a Windows XP box, I strongly advise you to switch to another browser (please do not get stuck).
HTML5 is here! At least for my videos.
PS: Thanks to Luis who made some changes in the blog system installation to make this work (adding webm mimetype and IE9 compatibility meta tag). Thanks also to Jaime, Mariano, Nacho, Ramón and Victor (both) who tested the options in Mac OS / Safari.