Saturday, February 16. 2013
Testing WebRTC
For the last few months I have been following a new project closely related to HTML5: WebRTC (Web Real-Time Communication). WebRTC is an open project that aims to turn the web browser into a Real-Time Communications (RTC) device via JavaScript APIs. You already know that I talked about VoIP and the Web before in this blog, in an interesting (at least for me) two-entry series. WebRTC is being standardized by the W3C (World Wide Web Consortium). It is backed mainly by Google, although Mozilla and Opera quickly joined in, developing their own implementations. Microsoft was reluctant at first but has finally embraced the idea although, as usual, doing it their own way. Microsoft's proposal is CU-RTC-Web, which in general is WebRTC with different codecs and other minor changes (do you see any similarity to the HTML5 video tag problems?). On this subject Apple, with Safari, is currently missing in action, but it is easy to guess what their position will be.
When I first learned about WebRTC there was a good introduction on YouTube and, although things are changing very quickly, I still recommend it to get a brief understanding of the idea. Things are now getting more and more specific but, if you check the main WebRTC page, all the JS functions are still vendor-prefixed in every browser implementation (webkit*, moz* and so on) because there is no definitive standard API yet. When I started to look at the project I did not understand how all the different elements interact; my doubts were mainly about the role of the web server.
Last week I realized that Firefox and Chrome developers are working together to achieve real interoperability; there is even a demo application to test both implementations. I studied how the application works and what the real function of browsers and servers is in a WebRTC deployment.
First I am going to describe the two parts that the WebRTC API is divided into:
The JavaScript getUserMedia API, which allows the browser to access the camera and microphone hardware. The returned stream can simply be displayed locally or sent over the network.
The PeerConnection API, which allows browsers to establish direct peer-to-peer (P2P) socket connections between them. The connection links a browser directly to someone else's browser and data is exchanged directly. P2P is very useful for high-bandwidth media like video and audio, because the server is freed from dealing with large amounts of data. Obviously concepts like STUN servers or media proxies have to be considered in the underlying implementation of the API.
Both parts are needed to get WebRTC in your browser, but they can be considered different parts of the same API. After studying the demo application, I found that it works as follows:
As soon as a user enters the application a room is assigned (a numeric identifier which is managed as a URL parameter). At this point the browser gets access to the local resources (camera and mic), an operation that asks the user for permission (the getUserMedia part). It also creates a peer connection offer which is sent to the server (the PeerConnection API part). This offer, I suppose, contains things like IPs, codecs and everything else needed to establish a P2P connection.
The partner browser enters the application specifying the same room (same parameter r=<ROOM_ID>) and performs the same operations (it accesses the local resources and sends another offer to the server). But this time the server finds that there is another user in the same room and both offers are exchanged.
At this point there is a PeerConnection negotiation to decide how the communication between both browsers is going to work, with the server acting as the intermediary that lets both partners interact (both partners know the server and can talk through it). Each offer is exchanged between the browsers and all the decisions needed to establish a P2P connection are agreed (I do not really know what is decided here). The application uses the Google App Engine Channel API for the communication between browsers and server.
After the negotiation both browsers can interact directly with each other and the video call (sound and video) takes place without any server interaction.
Finally, when any user leaves the page or hangs up the call, the browser contacts the server again to report that this user has left the room (the server deletes the user from the room).
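The server side of the steps above can be sketched in a few lines. This is only a minimal in-memory model written for illustration, not the code of the demo application (which is a Python App Engine application): the class SignalingRooms, its method names and the per-client inbox that stands in for the push channel are all hypothetical, client ids are assumed to be unique, and the offers are treated as opaque strings.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical signaling server: pairs two clients per room and swaps their offers. */
final class SignalingRooms {
    // roomId -> client ids in arrival order (at most two per room)
    private final Map<String, List<String>> rooms = new HashMap<>();
    // clientId -> offer submitted when the client joined
    private final Map<String, String> offers = new HashMap<>();
    // clientId -> messages waiting to be pushed to that client
    private final Map<String, Deque<String>> inbox = new HashMap<>();

    /** Registers a client in a room; if a partner is already there, both offers are exchanged. */
    synchronized void join(String roomId, String clientId, String offer) {
        List<String> members = rooms.computeIfAbsent(roomId, k -> new ArrayList<>());
        if (members.size() >= 2) {
            throw new IllegalStateException("room " + roomId + " is full");
        }
        offers.put(clientId, offer);
        inbox.put(clientId, new ArrayDeque<>());
        for (String partner : members) {
            inbox.get(partner).add(offer);                 // partner learns about the newcomer
            inbox.get(clientId).add(offers.get(partner));  // newcomer learns about the partner
        }
        members.add(clientId);
    }

    /** Returns the next message pending for a client, or null if there is none. */
    synchronized String poll(String clientId) {
        Deque<String> queue = inbox.get(clientId);
        return queue == null ? null : queue.poll();
    }

    /** Removes a client from its room; empty rooms are discarded. */
    synchronized void leave(String roomId, String clientId) {
        List<String> members = rooms.get(roomId);
        if (members != null) {
            members.remove(clientId);
            if (members.isEmpty()) {
                rooms.remove(roomId);
            }
        }
        offers.remove(clientId);
        inbox.remove(clientId);
    }

    /** Number of clients currently in the room. */
    synchronized int occupancy(String roomId) {
        List<String> members = rooms.get(roomId);
        return members == null ? 0 : members.size();
    }

    public static void main(String[] args) {
        SignalingRooms server = new SignalingRooms();
        server.join("1234", "chrome", "offer-from-chrome");
        System.out.println("chrome waits, pending message: " + server.poll("chrome"));
        server.join("1234", "firefox", "offer-from-firefox");
        System.out.println("chrome receives: " + server.poll("chrome"));
        System.out.println("firefox receives: " + server.poll("firefox"));
        server.leave("1234", "chrome");
        System.out.println("occupancy after leave: " + server.occupancy("1234"));
    }
}
```

The point of the sketch is that the server never touches audio or video: it only keeps track of who is in each room and relays the small negotiation messages, which is why the media can later flow directly between the browsers.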
With this application I have finally understood the role of the server in WebRTC: it is the one that connects two particular clients/browsers (so concepts like presence, which were discussed in the previous entries about VoIP, are now the responsibility of the server application). Now I am going to describe the steps I performed to test the demo:
The application was checked out:
$ svn checkout http://webrtc-samples.googlecode.com/svn/trunk/apprtc/
The demo runs on Google App Engine (Python version), so I downloaded the SDK for Linux and unzipped it:
$ unzip -q ~/Desktop/google_appengine_1.7.4.zip
The application was started, listening on all the interfaces of my box:
$ google_appengine/dev_appserver.py --address=0.0.0.0 apprtc/
To test the application, the demo says to use Firefox Nightly (FF21) and Chrome beta (Chrome 25). In theory Chrome 23 (version 21 added the getUserMedia API and version 23 the PeerConnection one) and Firefox 18 (the version that added all the WebRTC code to the Mozilla browser) have the full WebRTC API but, clearly, both implementations are very immature and are evolving version by version (take into consideration that the standard itself is only a draft right now). The application has some tweaks to make the conversation between both browsers possible.
So Chrome 25 (beta channel right now) and Firefox 21 (nightly at the moment) were installed on my old laptop (the new one does not have a camera) and on my desktop machine:
$ /opt/google/chrome/chrome --version
Google Chrome 25.0.1364.68
$ ./firefox --version
Mozilla Firefox 21.0a1
Launch Chrome 25 in one box and access the application (http://192.168.1.131:8080 in my case). As soon as you enter, the browser is redirected to a URL with a room identifier parameter (http://192.168.1.131:8080?r=<ROOM_ID>). The browser asks for permission to access the camera and microphone and waits for the other partner to join.
Launch Firefox 21 in the other box and access the application using the same room id (enter the URL with the parameter directly, http://192.168.1.131:8080?r=<ROOM_ID>). Firefox asks for permission too and, as soon as it enters, the communication starts.
Firefox needs two options enabled in the about:config page: media.navigator.enabled and media.peerconnection.enabled. In the downloaded version 21 only the second one was not set by default.
Here is the video:
Today's entry is a short explanation of the new WebRTC project. As I said before, I did not understand it completely at the beginning, and this entry dissects the new demo application just to review its basics. I think it is now clearer to me and I hope I helped you understand it too. Obviously WebRTC is emerging as a new HTML5 feature and (as with many of the previous features) dispute is assured among the major players.
Web Wars forever!
Saturday, February 2. 2013
SPNEGO/Kerberos in JavaEE (PAC)
The previous entry of the blog deployed the SPNEGO filter in a GlassFish v3 container, performing transparent SPNEGO/Kerberos login in a Samba 4 Windows domain (it is really interesting how difficult it is to replace a common pair, Windows AD DC and .NET IIS application, outside the Microsoft world). But it worked: you could see how a properly configured browser on a Windows domain member machine silently logged in. But I also mentioned that an important part was missing: AD groups. Without them, authorization in JavaEE (groups should be mapped to application roles) was impossible. I have been investigating the issue and this new entry is the result.
The first thing I found was an interesting article by Jens Bo Friis which explains how the Windows Kerberos implementation deals with the group membership problem (take into account that it is from 2005, which gives an idea of how little progress there has been on this subject). It seems that Microsoft added an extension to the Kerberos token called the Privilege Attribute Certificate (PAC) which, among other information, contains the group SIDs (Security IDentifiers) of the logged-in user. So the group membership information is already there, in the token itself (remember the SPNEGO negotiation explained in the previous entry). I prefer this solution to accessing the AD directly, which would be the easy integration (after identifying the user via Kerberos, the filter would query the LDAP/AD for the user's groups, the memberOf attribute).
Following the PAC solution, the problem now is retrieving that information from the token. The Kerberos token is ASN.1 encoded and the part that contains the PAC is also encrypted. Searching quite deeply, I found that some classes in the Spring project decode the token to obtain the SIDs. The Spring class uses another project called jaaslounge, which is another attempt at Windows/Kerberos/AD SSO integration. Looking at the Spring class, only the decoding package was needed for parsing, not the whole project. The jaaslounge library is very confusing to download, so I finally decided to get the source directly from SVN and compile only the needed package (org.jaaslounge.decoding). Finally, the decoding classes import the ASN.1 parsing framework from Bouncy Castle.
In summary, the SPNEGO filter (the project I used in the previous entry) is going to retrieve the SIDs from the token using the jaaslounge decoding package, which in turn uses the Bouncy Castle ASN.1 implementation. Besides, to make matters worse, the jaaslounge parsing classes did not work out of the box (I suppose there is some problem with recent versions of Bouncy Castle). Therefore I created a frankenjar NetBeans project which contains the spnego-r7 filter implementation with the new code, the modified jaaslounge decoding package and a utility class from Spring (PacUtility.java, which converts a SID from binary to text). It is really remarkable that, when I do odd things, there are very few like me. With that mess the PAC section can be parsed with code similar to the following (it was added to the SpnegoAuthenticator.java class):
LOGGER.finer("Let's parse the kerberos token...");
sids = new ArrayList<String>();
// create the SPNEGO token parser
SpnegoInitToken spnegoToken = new SpnegoInitToken(gss);
// retrieve the mech (Kerberos) token
byte[] mechanismToken = spnegoToken.getMechanismToken();
// get the server credentials from the service account subject (keytab)
Set<KerberosKey> creds = this.loginContext.getSubject()
        .getPrivateCredentials(KerberosKey.class);
KerberosKey[] keys = creds.toArray(new KerberosKey[creds.size()]);
// parse the kerberos token
KerberosToken kerberosToken = new KerberosToken(mechanismToken, keys);
// get all the authorizations
List<KerberosAuthData> authorizations = kerberosToken.getTicket()
        .getEncData().getUserAuthorizations();
for (KerberosAuthData authorization : authorizations) {
    // only PAC entries carry group SIDs; skip any other authorization data
    if (!(authorization instanceof KerberosPacAuthData)) {
        continue;
    }
    // get the PAC info
    PacLogonInfo logonInfo = ((KerberosPacAuthData) authorization)
            .getPac().getLogonInfo();
    if (logonInfo != null) {
        if (logonInfo.getGroupSid() != null) {
            String sid = PacUtility.binarySidToStringSid(
                    logonInfo.getGroupSid().toString());
            LOGGER.log(Level.FINER, "Adding primary group SID: {0}", sid);
            sids.add(sid);
        }
        if (logonInfo.getGroupSids() != null) {
            for (PacSid pacSid : logonInfo.getGroupSids()) {
                String sid = PacUtility.binarySidToStringSid(pacSid.toString());
                LOGGER.log(Level.FINER, "Adding secondary group SID: {0}", sid);
                sids.add(sid);
            }
        }
        if (logonInfo.getExtraSids() != null) {
            for (PacSid pacSid : logonInfo.getExtraSids()) {
                String sid = PacUtility.binarySidToStringSid(pacSid.toString());
                LOGGER.log(Level.FINER, "Adding extra group SID: {0}", sid);
                sids.add(sid);
            }
        }
        if (logonInfo.getResourceGroupSids() != null) {
            for (PacSid pacSid : logonInfo.getResourceGroupSids()) {
                String sid = PacUtility.binarySidToStringSid(pacSid.toString());
                LOGGER.log(Level.FINER, "Adding resource group SID: {0}", sid);
                sids.add(sid);
            }
        }
    }
}
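For reference, the binary-to-text step that the listing above delegates to PacUtility can be sketched on its own. This is not the Spring class: SidFormat is a hypothetical name and, unlike the utility used above (which receives a String), this sketch assumes the raw SID bytes are available. It only relies on the standard binary SID layout: one revision byte, one sub-authority count byte, a 6-byte big-endian identifier authority, and then the 32-bit little-endian sub-authorities.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helper that renders a binary Windows SID as S-R-A-S1-S2-... text. */
final class SidFormat {
    private SidFormat() {
    }

    static String toStringSid(byte[] sid) {
        if (sid == null || sid.length < 8) {
            throw new IllegalArgumentException("SID too short");
        }
        int revision = sid[0] & 0xFF;
        int subAuthorityCount = sid[1] & 0xFF;
        if (sid.length != 8 + 4 * subAuthorityCount) {
            throw new IllegalArgumentException("SID length does not match sub-authority count");
        }
        // identifier authority: 48 bits, big-endian
        long authority = 0;
        for (int i = 2; i < 8; i++) {
            authority = (authority << 8) | (sid[i] & 0xFF);
        }
        StringBuilder text = new StringBuilder("S-")
                .append(revision).append('-').append(authority);
        // sub-authorities: unsigned 32-bit values, little-endian
        ByteBuffer subAuthorities = ByteBuffer.wrap(sid, 8, 4 * subAuthorityCount)
                .order(ByteOrder.LITTLE_ENDIAN);
        for (int i = 0; i < subAuthorityCount; i++) {
            long value = subAuthorities.getInt() & 0xFFFFFFFFL;
            text.append('-').append(value);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // binary form of the well-known BUILTIN\Administrators SID
        byte[] builtinAdmins = {
            1, 2, 0, 0, 0, 0, 0, 5,
            0x20, 0, 0, 0,
            0x20, 0x02, 0, 0
        };
        System.out.println(toStringSid(builtinAdmins)); // S-1-5-32-544
    }
}
```

The last sub-authority of a domain SID is the relative identifier (RID) of the group or user, which is why the same domain prefix shows up in every group SID recovered from the PAC.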
The parsing is only executed once, on the user's first access; in the following requests the SIDs are recovered from the Java session. The SIDs have been added to SpnegoPrincipal.java and (with a proper cast) they can be obtained from it or checked with the isUserInRole method (role references cannot be resolved, so the argument must be a SID).
// request the SIDs as a list
List<String> sids = ((SpnegoPrincipal) request.getUserPrincipal()).getGroupSIDs();
// check if the user belongs to "Domain Users" (RID 513)
request.isUserInRole("S-1-5-21-2145774160-2038213957-1596523949-513");
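Since the application only sees raw SIDs, a real project would probably want to translate them into role names somewhere. A minimal sketch of one way to do it follows, assuming the domain SID is known in advance. The class SidRoleMapper and the role names "user" and "admin" are hypothetical and not part of the SPNEGO filter; the RIDs 513 (Domain Users) and 512 (Domain Admins) are the well-known Windows values.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical mapper from group SIDs of one domain to application role names. */
final class SidRoleMapper {
    private final String domainPrefix;
    private final Map<Long, String> rolesByRid;

    SidRoleMapper(String domainSid, Map<Long, String> rolesByRid) {
        this.domainPrefix = domainSid + "-";
        this.rolesByRid = new HashMap<>(rolesByRid);
    }

    /** Returns the roles for the SIDs that belong to the domain and have a mapped RID. */
    Set<String> rolesFor(Collection<String> sids) {
        Set<String> roles = new TreeSet<>();
        for (String sid : sids) {
            if (!sid.startsWith(domainPrefix)) {
                continue; // SID from another authority (for example BUILTIN groups)
            }
            try {
                long rid = Long.parseLong(sid.substring(domainPrefix.length()));
                String role = rolesByRid.get(rid);
                if (role != null) {
                    roles.add(role);
                }
            } catch (NumberFormatException e) {
                // not a direct child of the domain SID: ignore it
            }
        }
        return roles;
    }

    public static void main(String[] args) {
        Map<Long, String> mapping = new HashMap<>();
        mapping.put(513L, "user");  // Domain Users
        mapping.put(512L, "admin"); // Domain Admins
        SidRoleMapper mapper = new SidRoleMapper(
                "S-1-5-21-2145774160-2038213957-1596523949", mapping);
        System.out.println(mapper.rolesFor(Arrays.asList(
                "S-1-5-21-2145774160-2038213957-1596523949-513",
                "S-1-5-32-545")));
    }
}
```

In the filter this kind of lookup could sit behind isUserInRole, so that the application code asks for a role name instead of a SID, but that integration is left out here.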
I already know the whole solution is crap, but it should be considered a simple PoC (a better integration should be done in a real project). More importantly, the solution is feasible; here is my demonstration video (it just shows how the SIDs are obtained from the token in the first request and, in the second, simply read from the session).
Well, today's entry is a horribly messy JAR that lets the SPNEGO filter decode the PAC (the AD-specific section of the Kerberos token) in order to add the list of SIDs (group identifiers) of the logged-in user to the principal. The JAR (the project can be downloaded from a previous link) is the original SPNEGO project JAR file with some additions: a modified jaaslounge decoding package, used for Kerberos token parsing, and a utility class from Spring Kerberos authentication, used to transform binary SIDs into strings. Please remember that all these entries use Samba 4 as the DC. The solution is just a PoC, a proof that Jens' idea about Windows Kerberos security is workable. I know the whole implementation could be (very much) improved, but it is the best I could do with the little time I spent on it; I will try to build a more robust solution if I find the time.
Sometimes you eat the bear... And sometimes the bear eats you.