Sunday, March 16. 2014
Investigating the EclipseLink Problem
Continuing the investigation of the EclipseLink problem commented in the blog two posts ago a new entry is presented. The post is mainly motivated because I have seen no progress at all neither in the public eclipse bug system nor in the internal request. Throughout the present entry the eclipselink library will be built and tested, and the the problem will be studied in order to try to understand it and present a fix.
The building process is explained in the project wiki page and it is not very complicated:
The git branch for last version 2.5 is cloned (current version is 2.5.2 which is still not released):
git clone -b 2.5 git://git.eclipse.org/gitroot/eclipselink/eclipselink.runtime.git
Once the source is available the compiling process is performed:
cd eclipselink.runtime export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64 export M2_HOME=/usr/share/maven/ ant -f antbuild.xml
Now the bundle can be tested. In the wiki page it is commented that this should be done against MySQL or Oracle DDBB, so I decided to install MySQL in my debian box:
apt-get install mysql-server
And then a database and a user for the testing were created:
create database test; create user 'test'@'localhost' IDENTIFIED BY 'test'; GRANT ALL PRIVILEGES ON test.* TO 'test'@'localhost' WITH GRANT OPTION;
A build.properties file is needed to perform the testing. In this file the following database connection properties were defined:
extensions.depend.dir=${user.home}/extension.lib.external oracle.extensions.depend.dir=${extensions.depend.dir} junit.lib=/usr/share/maven-repo/junit/junit/4.11/junit-4.11.jar tools.lib=${env.JAVA_HOME}/lib jdbc.driver.jar=/home/ricky/mysql-connector-java-5.1.29-bin.jar db.driver=com.mysql.jdbc.Driver db.url=jdbc:mysql://localhost/test db.user=test db.pwd=test db.platform=org.eclipse.persistence.platform.database.MySQLPlatform
Finally the testing can be started (following wiki instructions I decided to perform test-lrg and test-jpa targets).
ant -f antbuild.xml test-lrg ant -f antbuild.xml test-jpa
The different tests produce an HTML page with the results. Here are the numbers for the default compilation with no modifications done for my specific problem.
Tests Failures Errors Skipped Success Rate Time eclipselink.jpa.wdf.test 1014 0 0 165 100.00% 233.650 eclipselink.jpa.test 3157 0 0 0 100.00% 2780.802 eclipselink.core.test 7541 0 199 0 97.36% 1548.325 eclipselink.sdo.test 4613 0 0 0 100.00% 205.892 eclipselink.moxy.test/oxm 31846 0 0 0 100.00% 364.520 eclipselink.moxy.test/jaxb 20707 0 0 0 100.00% 80.117 The only errors are related to a missing descriptor class and I did not waste any time trying to fix it. The error exception was the following: Exception Description: Fatal query exception occurred. Internal Exception: Exception [EclipseLink-6007] (Eclipse Persistence Services - 2.5.2.qualifier): org.eclipse.persistence.exceptions.QueryException Exception Description: Missing descriptor for [class org.eclipse.persistence.testing.models.employee.domain.Employee]. Query: ReadAllQuery(referenceClass=Employee ).
Once the eclipselink.jar library is compiled and ready, I started to check my problem. First I decided to read the JPA 2.1 specification in order to guess if the test-case is indeed covered by the standard (remember that the test-case reads an entity object from the DDBB with one manager, and then, after closing that manager and using this object as a template, it changes the ID of the object and merges it inside another manager).
First, in the chapter 3.2.7, a detached entity is defined as the following:
A detached entity results from transaction commit if a transaction-scoped persistence context is used; from transaction rollback; from detaching the entity from the persistence context; from clearing the persistence context; from closing an entity manager; or from serializing an entity or otherwise passing an entity by value—e.g., to a separate application tier, through a remote interface, etc.
Therefore, in the merging last part of the test-case, the entity object is clearly detached for the new manager (the previous manager was closed). This was more or less clear because if you programmatically check for that, the object was declared as detached (the EntityManager contains method returns false for the entity). The second thing to check was assuring that the merge method should work for a detached entity, and here it is what the JPA specification says about this subject in the chapter 3.2.7.1:
The semantics of the merge operation applied to an entity X are as follows:
- If X is a detached entity, the state of X is copied onto a pre-existing managed entity instance X' of the same identity or a new managed copy X' of X is created.
- ...
So the object in our example should be copied into another object and, because the object is not managed before by this manager, a new managed entity should be read from the DDBB. For me this part was also clear, this should be supported cos the test-case works with no problem de-configuring weaving in eclipselink. In summary, the test-case is valid and it should work in any JPA implementation.
Digging deeper inside the problem, I think that I finally understood what is happening. Weaving modifies the entity class in order to add some helper methods that accelerate typical JPA processes. The internal weaving, among other possible things, modifies the entity class to implement the PersistenceEntity interface. That interface provides some helper methods to cache the primary key of the entity (when an entity object is read from the DDBB, the PK is cached inside the entity object itself), those methods help the JPA implementation to easily and quickly retrieve the PK of the entity (avoiding reflection or any other expensive technique). And mainly this is the problem. In the merge part the entity, although detached, has the PK filled with the original value and this fact makes the merge method to fail. The stack of calls when the PK is retrieved from the object in the last problematic merge is the following:
at org.eclipse.persistence.internal.descriptors.ObjectBuilder.extractPrimaryKeyFromObject (ObjectBuilder.java:2986) at org.eclipse.persistence.internal.sessions.MergeManager.registerObjectForMergeCloneIntoWorkingCopy (MergeManager.java:1047) at org.eclipse.persistence.internal.sessions.MergeManager.mergeChangesOfCloneIntoWorkingCopy (MergeManager.java:557) at org.eclipse.persistence.internal.sessions.MergeManager.mergeChanges (MergeManager.java:313) at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.mergeCloneWithReferences (UnitOfWorkImpl.java:3521) at org.eclipse.persistence.internal.sessions.RepeatableWriteUnitOfWork.mergeCloneWithReferences (RepeatableWriteUnitOfWork.java:384) at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.mergeCloneWithReferences (UnitOfWorkImpl.java:3481) at org.eclipse.persistence.internal.jpa.EntityManagerImpl.mergeInterna l(EntityManagerImpl.java:542) at org.eclipse.persistence.internal.jpa.EntityManagerImpl.merge (EntityManagerImpl.java:519) at es.rickyepoderi.weaving.Test.prueba (Test.java:75) at es.rickyepoderi.weaving.Test.main (Test.java:86)
The MergeManager.mergeChangesOfCloneIntoWorkingCopy tries to perform exactly what the specification says (tries to locate a managed entity X' for the entity passed X) but when it gets the PK of the entity the ObjectBuilder returns the cached PK of the detached object. Here I think it is the failure, the original PK is returned and not the new one, and finally the exception is thrown because the implementation tries to update the old object instead of creating a new one with the new PK. In my humble opinion, the eclipselink implementation should check if the object is managed before trusting in the cached PK. If the object is detached the cached PK can be invalid and it should be obtained using the common entity properties. It is clear that if no internal weaving is performed, the real PK is always returned, because the PK is never cached, and the problem is avoided.
Moreover there is not an easy way to fix the ObjectBuilder. The UnitOfWork has a isObjectRegistered method to know if the object is managed or detached inside the unit but ObjectBuilder only receives an Abstractsession (UnitOfWorkImpl is a child class of AbstractSession) and this method is not exposed in this class, therefore, there is no clear way to know if the object is managed or detached inside the ObjectBuilder. Finally I decided to do a cheat just to check if I was right, I did the following straight change:
diff --git a/foundation/org.eclipse.persistence.core/src/org/eclipse/persistence/internal/descriptors/ObjectBuilder.java b/ index 04a3283..1557667 100644 --- a/foundation/org.eclipse.persistence.core/src/org/eclipse/persistence/internal/descriptors/ObjectBuilder.java +++ b/foundation/org.eclipse.persistence.core/src/org/eclipse/persistence/internal/descriptors/ObjectBuilder.java @@ -63,6 +63,7 @@ import org.eclipse.persistence.sessions.remote.*; import org.eclipse.persistence.sessions.CopyGroup; import org.eclipse.persistence.sessions.SessionProfiler; import org.eclipse.persistence.sessions.DatabaseRecord; +import org.eclipse.persistence.sessions.UnitOfWork; /* * * <p><b>Purpose</b>: Object builder is one of the behavior class attached to descriptor. @@ -2984,7 +2985,8 @@ public class ObjectBuilder extends CoreObjectBuilder<AbstractRecord, AbstractSes boolean isPersistenceEntity = (domainObject instanceof PersistenceEntity) && (!isXMLObjectBuilder()); if (isPersistenceEntity) { Object primaryKey = ((PersistenceEntity)domainObject)._persistence_getId(); - if (primaryKey != null) { + if (primaryKey != null && + (!(session instanceof UnitOfWork) || ((UnitOfWork)session).isObjectRegistered(domainObject))) { return primaryKey; } }
It is quite weird, but if the AbstractSession is a UnitOfWork the object is checked to know if it is managed or detached and, if it is detached, the internal PK is not returned, normal processing is performed and the real ID is returned as the PK. Doing this little change the simple test-case works and I re-run the tests that I performed with the out of the box bundle. Same results were obtained (so my change seems to be harmless, the patch is only a performance penalty when the object is detached, the PK is recalculated, but I think this is not a common situation).
I think that the problem is understood, but I am not absolutely sure because the complexity of the project is quite remarkable and it could have some collateral effects difficult to foresee. (That is why I re-ran the tests, to check if something was broken after my change. Hopefully everything worked as before but you never know for sure.) The presented fix is not valid, a tidier solution should be done, my fix was just a proof to evaluate if my idea worked. I tried to explain this into the bug, but unfortunately there is no response for the moment.
Keep on trying!
Comments