Saturday, May 24. 2014
Couchbase Manager for Glassfish: Version 0.5
As you know, I am developing a toy project called couchbase-manager, which is basically a session manager for the glassfish application server (versions 3 and 4) whose main feature is storing the sessions in a couchbase server. Some days ago version 0.5 of the manager was released, and today's entry is dedicated to presenting the new features in the last two versions. (Version 0.4 was released while the blog was being upgraded and, although I prepared the entry, it was never published. I simply forgot about it and, by the time I realized my oversight, I thought it was too late.)
In general the new version 0.5 is much less attractive than the previous one, because version 0.4 was a big change inside the manager. I will try to summarize all the changes below.
The version 0.4 introduced a new important feature: external attributes. Now in the couchbase-manager there are situations when an attribute inside the session is managed as another object in the couchbase repository. Until this version the session was always managed as a whole (it was serialized, stored and de-serialized as a complete object). The main target of this idea is managing smaller sessions in general.
It is obvious that this feature is an improvement only in some circumstances and, during its development, I tried to identify those situations. It is important to remember that the manager handles two different configurations (sticky and non-sticky) and this new feature affects them in different ways. The sticky configuration never reads the session from the external repository (as the manager can be sure it is the only one which manages its sessions, it can trust the sessions which are in memory) and the session saving is always done in the background. Therefore, for this configuration, a lighter session is not very important. The non-sticky configuration, on the other hand, always reads (and blocks) the session when a new request comes in, so a smaller session is a direct benefit. But what happens if the application requests an externalized attribute? It is clear that the manager has to read the attribute synchronously. That means that, in the worst case (all the externalized attributes are requested), this feature is a penalty: the same information is read but in several requests, adding the network and couchbase processing times for each operation. In summary, the new feature is only an advantage if the attributes managed as external are big and rarely accessed. If all the attributes are frequently requested by the application this feature is useless or, even worse, a penalty.
The externalized attribute values are never kept inside the session, regardless of the configuration used. In both configurations the real value is always deleted from the session (saving memory) and only the reference is maintained (this reference is the key to the object in couchbase).
To achieve this feature the package es.rickyepoderi.couchbasemanager.couchbase, which manages the interaction with couchbase, was heavily modified to perform bulk operations (the class BulkClientRequest executes several operations at the same time: the operations needed to manage the external attributes and the session go in a single bulk request). The other big change was made in the CouchbaseWrapperSession: the session now needs to track the attributes in order to know which of them are rarely accessed. In short, big attributes (attrMaxSize property, 10KB by default) are tracked, they are externalized if their usage goes below a specified percentage, and they are re-integrated in the session if the usage goes above another limit (attrUsageCondition). The tracking is done through the UsageStats class.
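The externalize / re-integrate rule described above can be sketched as a small decision function. This is only an illustration under my own assumptions, not the code of UsageStats or CouchbaseWrapperSession: the class name, the Placement enum and the two ratio parameters (hypothetical stand-ins for whatever attrUsageCondition really encodes) are invented for the example, and treating "big" as "at least attrMaxSize bytes" is also a guess.

```java
// Hypothetical sketch of the externalization rule; NOT the real UsageStats code.
final class AttributeUsageSketch {

    enum Placement { IN_SESSION, EXTERNAL }

    private final int attrMaxSize;         // bytes; the post gives 10KB as default
    private final double externalizeBelow; // hypothetical low usage limit (0..1)
    private final double integrateAbove;   // hypothetical high usage limit (0..1)

    AttributeUsageSketch(int attrMaxSize, double externalizeBelow, double integrateAbove) {
        if (externalizeBelow > integrateAbove) {
            throw new IllegalArgumentException("low limit must not exceed high limit");
        }
        this.attrMaxSize = attrMaxSize;
        this.externalizeBelow = externalizeBelow;
        this.integrateAbove = integrateAbove;
    }

    // accesses: requests in which the attribute was used; requests: all tracked requests
    Placement decide(Placement current, int sizeBytes, long accesses, long requests) {
        if (sizeBytes < attrMaxSize) {
            return Placement.IN_SESSION;   // small attributes are never externalized
        }
        if (requests == 0) {
            return current;                // no statistics yet, keep things as they are
        }
        double usage = (double) accesses / requests;
        if (current == Placement.IN_SESSION && usage < externalizeBelow) {
            return Placement.EXTERNAL;     // big and rarely used: move out of the session
        }
        if (current == Placement.EXTERNAL && usage > integrateAbove) {
            return Placement.IN_SESSION;   // used often again: bring it back
        }
        return current;                    // between both limits nothing changes
    }

    public static void main(String[] args) {
        AttributeUsageSketch policy = new AttributeUsageSketch(10 * 1024, 0.2, 0.5);
        System.out.println(policy.decide(Placement.IN_SESSION, 12000, 1, 12));
        System.out.println(policy.decide(Placement.EXTERNAL, 12000, 12, 12));
    }
}
```

Using two different limits means an attribute whose usage hovers around a single threshold does not bounce in and out of the session on every request.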
Version 0.5 adds another good feature on top of the changes made in version 0.4. As a consequence of the previous externalization feature, each attribute of the session is serialized and de-serialized independently (before that, the session was serialized and de-serialized as a whole object). This change makes it possible to delay the de-serialization of an attribute until the moment it is requested by the application; if the attribute is never requested, that time is saved. It also has a second benefit: if the attribute is not accessed by the application, the same serialized object (which has never been de-serialized) is still valid. The first part is only important in the non-sticky configuration (in sticky the attribute value is already stored in the session and the manager does not need to read or de-serialize it), but the second is valid for both (the session saving is performed in both configurations, and the attributes which were not accessed can be saved directly without serializing them again).
Nonetheless this feature has some penalty in memory usage in the sticky setup. This configuration relies on the idea that the same manager is always going to manage a given session, and for that reason the attribute values remain in the session to avoid unnecessary reads. Now the values are kept twice (the serialized byte array and the real object value). When an attribute is accessed by the application the serialized byte array is removed, but it is stored again as soon as it is calculated for the saving. In the non-sticky configuration this penalty does not happen: in this case the attribute values are always cleared from the session and they are re-read at the beginning of the request. When they are read, all of them are stored only as a serialized byte array, and just the ones that are accessed by the application are de-serialized. So in this configuration the attribute value is either the serialized array or the real object, but never both at the same time. Here it is important to remember that externalized attributes are always removed from the session (which means that big / unused attributes are not duplicated in the sticky configuration).
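The lazy de-serialization idea can be illustrated with a small holder class. Again this is a sketch under my own assumptions and not the code of the manager: the class name, the counters and the keepValue flag (true imitating the sticky setup, false the non-sticky one) are hypothetical, and plain Java serialization is used just to keep the example self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

// Hypothetical holder for one session attribute; NOT the real manager code.
final class LazyAttributeSketch {

    private byte[] serialized;     // bytes as read from the repository, or null
    private Object value;          // real object, only present after a request
    private int deserializations;  // counters used just to show the savings
    private int serializations;

    static LazyAttributeSketch fromBytes(byte[] bytes) {
        LazyAttributeSketch attr = new LazyAttributeSketch();
        attr.serialized = bytes;
        return attr;
    }

    static LazyAttributeSketch fromValue(Object value) {
        LazyAttributeSketch attr = new LazyAttributeSketch();
        attr.value = value;
        return attr;
    }

    // Called when the application asks for the attribute.
    Object get() {
        if (value == null && serialized != null) {
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(serialized))) {
                value = in.readObject();
                deserializations++;
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            } catch (ClassNotFoundException e) {
                throw new IllegalStateException(e);
            }
        }
        // The application may modify the object, so the old bytes are dropped.
        serialized = null;
        return value;
    }

    // Called when the session is saved.
    byte[] toBytes(boolean keepValue) {
        if (serialized == null) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(value);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            serialized = buffer.toByteArray();
            serializations++;
        }
        if (!keepValue) {
            value = null;  // non-sticky imitation: only the byte array remains
        }
        return serialized;
    }

    int deserializations() { return deserializations; }

    int serializations() { return serializations; }

    public static void main(String[] args) {
        byte[] stored = fromValue("some value").toBytes(false);
        LazyAttributeSketch untouched = fromBytes(stored);
        untouched.toBytes(false);  // saved again without touching the object
        System.out.println("serializations: " + untouched.serializations());
    }
}
```

An attribute that is never requested goes through a whole read and save cycle without a single serialization or de-serialization, and with keepValue set to true the holder ends up keeping both the bytes and the object, which is the memory penalty commented for the sticky setup.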
The final new feature is something that was completely forgotten in the previous versions. JavaEE provides some listeners for monitoring the life-cycle of the session and its attributes (when a session is created, destroyed or renamed, or when an attribute is added, modified or deleted). Until version 0.5 those listeners were not taken into account, so the behavior with them was unknown.
Finally I remembered the existence of those listeners and tested what happened with them when using the manager. There were problems in only one situation: the destruction of a session because of inactivity. If you remember, the manager considered a session invalidated by inactivity based on the expiration time in couchbase. If the object still existed in the repository the session was valid; if it had expired, and therefore no longer existed, the session was invalid. That was a very good idea (at least I think so), but the problem was that the session was no longer available once it had expired and, in turn, the listeners received an incomplete session (only in the non-sticky configuration, which does not maintain the attribute values).
Therefore a new property was added, extraInactiveInterval, which adds an extra time in seconds to the expiration time applied to sessions in couchbase (180 seconds by default). During this extra time the loop that searches for expired sessions has time to detect the session as expired and to invalidate it normally, calling the listeners properly. So, since version 0.5, a session is considered expired by checking times instead of checking whether it exists in couchbase. Obviously the session needs to be re-read (non-sticky) to be sure it is really expired. As in any other cluster manager there are special considerations when several instances are involved; please check this wiki page for more information.
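The timing involved can be sketched as follows. The method names are hypothetical and the real manager surely handles more cases; the only values taken from the text are the extraInactiveInterval property and its default of 180 seconds, while the 30 minute inactive interval is just an example value.

```java
// Hypothetical sketch of the expiration timing; NOT the real manager code.
final class SessionExpirationSketch {

    static final int DEFAULT_EXTRA_INACTIVE_INTERVAL = 180; // seconds, default in the post

    // Expiration applied to the session document in couchbase.
    static int repositoryTtlSeconds(int maxInactiveInterval, int extraInactiveInterval) {
        return maxInactiveInterval + extraInactiveInterval;
    }

    // The session is expired for the application: decided by times, not by existence.
    static boolean isExpired(long lastAccessedMillis, int maxInactiveInterval, long nowMillis) {
        return nowMillis - lastAccessedMillis >= maxInactiveInterval * 1000L;
    }

    // The document can still be read, so listeners can receive a complete session.
    static boolean isStillReadable(long lastAccessedMillis, int maxInactiveInterval,
            int extraInactiveInterval, long nowMillis) {
        long ttlMillis = repositoryTtlSeconds(maxInactiveInterval, extraInactiveInterval) * 1000L;
        return nowMillis - lastAccessedMillis < ttlMillis;
    }

    public static void main(String[] args) {
        int maxInactive = 1800;                    // example value: 30 minutes
        long lastAccessed = 0L;
        long now = 1_900_000L;                     // 1900 seconds later
        System.out.println("expired:  " + isExpired(lastAccessed, maxInactive, now));
        System.out.println("readable: " + isStillReadable(lastAccessed, maxInactive,
                DEFAULT_EXTRA_INACTIVE_INTERVAL, now));
    }
}
```

The window in which the first check is true and the second one is still true is the time the background loop has to find the session and invalidate it with all its attributes available.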
After all these changes new performance tests are presented but, this time, the tests themselves have changed. The previous performance tests executed requests for session attributes with options: 1x50, 4x50, 20x100 and 20x200 (number of attributes and size of each one, sessions of 50, 200, 2000 and 4000 bytes respectively). In order to test external attributes, some tests which manage bigger attributes are needed. Besides, a new command line option was added to the web services client application. Now there is a u option which specifies the number of attributes that are accessed in each request. An execution with u=1 means that in each update operation only one attribute, chosen randomly, is requested by the application, but if u=a (the number of attributes to modify is the same as the number of attributes created) all the attributes are modified. This command line option lets us modify the usage ratio of the attributes to force their externalization or not. From now on the tests performed are the following: 4x50-u1, 20x200-u1, 12x12000-u1 and 12x12000-u12. The first two are the same tests that were performed before (tests 2 and 4 of the previous versions). The other two are new ones, which use twelve attributes of 12000 bytes (total session size around 140K): one has an attribute usage ratio of 8% (u=1, which means that all attributes are going to be externalized in both configurations) and the other of 100% (u=12, the twelve attributes are always read and, therefore, none is externalized). Another difference is that my laptop is now configured with the performance governor. I saw that sometimes the numbers varied too much and I checked that the difference was caused by the frequency set by the governor (I suppose that the load is not big enough to reach the maximum frequency with the default ondemand governor in all the tests). The numbers for the four tests are presented below but, because of these differences, I am not going to compare them with previous versions.
Starting with the creation operation, the numbers are very similar in all the tests and configurations (except the sticky test where all the external attributes remain integrated in the session, which is slower in the three operations, and I really do not know why). Times should be similar because, more or less, all situations need the same operations against couchbase.
In the update graphic we have some interesting effects. The sticky configuration is not so clearly better, because the benefit of saving de-serializations mostly helps the non-sticky configuration. In both configurations times are better if only one attribute is accessed by the application (u=1); when all the attributes are requested (u=12) times are clearly worse. So the externalization feature is quite nice: managing smaller sessions is worthwhile despite the cost of reading one attribute synchronously. I have a strange feeling about the sticky case with u=12: this case is the worst in all three operations and I cannot explain why (in theory it should be a bit better than the non-sticky test).
Finally the delete operation presents another interesting result. The numbers for all the tests except the one that performs externalization are more or less the same in both configurations, but the test with externalization is remarkably worse. The reason is that, when deleting the session, all the session attributes are accessed to execute possible listeners, so, when they are externalized, all of them have to be read synchronously one by one. Those extra reads make this case almost double the time of the other cases.
The tests show that externalization is a very good feature for the manager. And I feel that in a typical application the externalization would be even more worthwhile (in the test with u=1 all the attributes have the same probability of being accessed by the application, which is not common in real life). As a final comment I want to say that the performance tests stressed the disk noticeably: there are a lot of sessions being created, deleted and modified, and couchbase persists all these changes to disk. Several times I have said that my couchbase environment does not need disk persistence (I think that replication is enough for common JavaEE applications), but the couchbase guys seem to be reluctant to provide such a configuration. I have read several times that the software is moving towards being a complete NoSQL database instead of a cache system. If that is finally true it is a real pity, because I chose couchbase because it was a cache and not because it was a database. I feel that, with disk persistence imposed, this manager is never going to be fully functional, because the best setup is unavailable.
Regards!
Saturday, May 10. 2014
Updating the old RDP portlet (Part I)
In a very old series of the blog a portlet was developed to manage Remote Desktop Protocol (RDP) applications inside a portal (specifically liferay). The series consisted of two complementary entries: the first one presented the solution using a windows client and the second post used a linux box. The portlet listed some windows applications and, when one of them was clicked, an RDP file was downloaded to be opened by a local RDP program. The windows entry worked perfectly (using the common Remote Desktop Connection or mstsc), but the linux solution was very limited because of the old rdesktop project. At that time it was not very active. Now it seems that it is gaining traction again, releasing new versions and acquiring new features more frequently. If you re-check the previous entries, three features were important to get a good integration of RDP applications inside a corporate portal.
RemoteApp: feature that displays the windows applications seamlessly inside the local desktop (instead of displaying the whole windows desktop, only the selected application window is displayed, integrated within the local desktop). This feature is essential so as not to scare the common user.
Gateway: a kind of proxy which wraps the RDP protocol over HTTPS, another basic feature which extends Terminal Services from the intranet to the internet. A necessary feature for avoiding a VPN in the final architecture.
RDP files: The last feature is managing RDP files. Any RDP connection can be exported to a text file which can be edited, copied and distributed easily. Indeed the portlet just downloads RDP files so that they are executed by a local application on the user's computer.
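To give an idea of how the three features meet in a single RDP file, here is a minimal sketch that generates one. It is not the code of the portlet: the class and method names are hypothetical, the property names are the ones commonly found in RDP files exported by the Microsoft tools, and the values and the line ending are my assumptions (a file downloaded from a real deployment contains many more properties).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical generator of a minimal RDP file; NOT the real portlet code.
final class RdpFileSketch {

    static String remoteApp(String host, String gateway, String appAlias) {
        List<String> lines = new ArrayList<>();
        lines.add("full address:s:" + host);                   // session host to connect to
        lines.add("remoteapplicationmode:i:1");                // RemoteApp, not the full desktop
        lines.add("remoteapplicationprogram:s:||" + appAlias); // alias of the published application
        lines.add("gatewayhostname:s:" + gateway);             // gateway that wraps RDP over HTTPS
        lines.add("gatewayusagemethod:i:1");                   // assumed value: always use the gateway
        lines.add("redirectclipboard:i:1");                    // clipboard redirection requested
        return String.join("\r\n", lines) + "\r\n";
    }

    public static void main(String[] args) {
        System.out.print(remoteApp("win2012int.demo.test", "win2012ext.demo.test", "mspaint"));
    }
}
```

Each property follows the name:type:value layout (s for strings, i for integers), which is what makes these files so easy to edit and distribute.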
The rdesktop program supports RemoteApp (although I am not sure if this is supported in a straightforward way) but it does not manage gateways or RDP files (the same situation as in the old linux entry). Nevertheless some weeks ago I realized that there is a new project that deals with RDP: FreeRDP. The project is immature right now but it is improving quickly and, to my surprise, the latest version 1.1 (in beta status) gives initial support to gateway proxies and also understands RDP files. As soon as I noticed this I decided to update my old portlet and try to make it work using FreeRDP in linux.
The first step was installing the windows Terminal Services server. I decided to use two 2012R2 servers, one is used as the internal box, session host and broker (win2012int), and the other would have been placed in the DMZ, acting as web access and gateway server (win2012ext). At this point the solution was tested using a windows 7 client from which I accessed the Remote Desktop Web application (another optional element of a TS deployment which exposes some previously configured applications inside an IIS application, it can be said that it is the Microsoft version of my little portlet).
Then branch 1.1 of FreeRDP was compiled (debian testing currently provides only version 1.0.2). I followed the instructions of the ifconfig.dk blog.
git clone -b stable-1.1 git://github.com/FreeRDP/FreeRDP.git
cd FreeRDP/
cmake -DCMAKE_BUILD_TYPE=Debug -DWITH_SSE2=ON \
  -DCMAKE_INSTALL_PREFIX=/home/ricky/apps/FreeRDP -DWITH_DEBUG_ALL=false
make
make install
cd /home/ricky/apps/FreeRDP/bin
./xfreerdp /version
This is FreeRDP version 1.1.0-beta1 (git 1.1.0-beta+2013071101-127-g01865)
I tested the resulting command and it worked with the gateway, but it had a very nasty issue when using the RemoteApp feature. The pointer was shifted (the mouse pointer in the windows application was displaced from the real one in the linux X server). I checked that this issue did not happen when using the gateway with the complete desktop and that it did happen when using a RemoteApp without the gateway. So it seems that the problem is only related to the remote applications. I even tried the master branch (1.2.0-beta1 (git 1.1.0-beta+2013071101-1641-g4da5c) version) and the results were mixed. The gateway did not work (it core dumped with a segmentation fault) and the RemoteApp (without the gateway) did not present the nasty shift issue, but it had problems refreshing the window when menus were popped out. Finally I decided to stay with version 1.1 because the gateway was a basic feature of the solution presented in this entry.
The options of the command are quite complicated (and there are some errors when they are combined with an RDP file), for this reason the following examples are listed.
Launching the complete desktop using normal port 3389 for RDP (not using the gateway):
./xfreerdp /d:DEMO /u:ricky /p:xxxxx \
  /v:win2012int.demo.test /cert-ignore
Launching the desktop but using the gateway (the same user is used for the gateway http login and the desktop session):
./xfreerdp /d:DEMO /u:ricky /p:xxxxx /v:win2012int.demo.test \
  /g:win2012ext.demo.test /cert-ignore
Launching the mspaint application (previously configured in the collection to be a valid remoteapp) without the gateway:
./xfreerdp -d:DEMO /u:ricky /p:xxxxx /app:"||mspaint" \
  /v:win2012int.demo.test /cert-ignore
Launching the mspaint application with the gateway:
./xfreerdp -d:DEMO /u:ricky /p:xxxxx /app:"||mspaint" \
  /v:win2012int.demo.test /g:win2012ext.demo.test /cert-ignore
Using an RDP file for mspaint (the file was directly downloaded from the RD web access application). The file does not contain any user or password and, I do not know why, the domain, user and password have to be passed twice, once for the gateway and once for the session. It looks as if, when using a file, the information is not re-used for both. Besides, the domain has to be passed both as the domain itself and as part of the username options (I think there are errors with those options when using an RDP file). All the rest of the information is read from the file (but there are also problems because, for example, clipboard redirection is not activated although the property redirectclipboard is set and FreeRDP supports it with the +clipboard option in the command line).
./xfreerdp /home/ricky/paint.rdp /d:DEMO /u:"DEMO\ricky" \
  /gd:DEMO /gu:"DEMO\ricky" /p:xxxxx /gp:xxxxx /cert-ignore
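Since the credentials have to be repeated for the session and the gateway, a launcher has to build that argument list from a single domain, user and password. The following is only a sketch of that step (the class and method names are hypothetical); the options are exactly the ones used in the previous command.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that builds the xfreerdp arguments for an RDP file.
final class XfreerdpArgsSketch {

    static List<String> forRdpFile(String binary, String rdpFile,
            String domain, String user, String password) {
        // The domain is passed on its own and also as a prefix of the user name.
        String qualifiedUser = domain + "\\" + user;
        List<String> args = new ArrayList<>();
        args.add(binary);
        args.add(rdpFile);
        args.add("/d:" + domain);          // session credentials
        args.add("/u:" + qualifiedUser);
        args.add("/gd:" + domain);         // the same credentials for the gateway
        args.add("/gu:" + qualifiedUser);
        args.add("/p:" + password);
        args.add("/gp:" + password);
        args.add("/cert-ignore");
        return args;
    }

    public static void main(String[] args) {
        List<String> command = forRdpFile("./xfreerdp", "/home/ricky/paint.rdp",
                "DEMO", "ricky", "xxxxx");
        System.out.println(String.join(" ", command));
    }
}
```

The resulting list can be handed to a ProcessBuilder, which avoids the shell quoting problems with the backslash in the user name. Bear in mind that a password passed this way is visible in the process list.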
The FreeRDP project is definitely immature software (at least in the features I wanted to use; I did not test typical intranet features like audio, video or device redirection). I was not able to combine the three desired features (remoteapp, gateway and rdp files) in a reasonable way. Version 1.1 (remember it is now in beta status) has a nasty shift issue and, although it supports RDP files, you still need something to request the domain, user and password, and some properties are not parsed completely. I am going to present a little video which shows the last command, in which FreeRDP interprets an RDP file obtained from the RD Web Access to launch the mspaint application. As you can see the cursor position is shifted (even more when the window is first displayed than when it is brought back to the front).
Although the results are not very satisfactory, a second part of this series is on the way. FreeRDP needs some time to handle the three needed features in a proper way (it is mainly done, there are only minor problems, although they are very annoying) but I am going to update my old portlet anyway. It will be nice to work again with liferay, cassandra and the portlet faces bridge. Despite all the previous comments, it is great to see another project dealing with RDP in linux. FreeRDP is quickly being improved and there are a lot of contributors working on the project, so I hope all these annoying issues will be fixed before version 1.1 is final.
Stay tuned for the next entry!