Monday 6 May 2019

NCM Investigation

Story 1: Why didn't the deployment pick up my changes?

I created a new jvmctl node of NCM on Hammer. I changed the port number and specified the git branch, and the deployment was successful. However, the system didn't seem to pick up my changes.

I first made sure all my code changes had indeed been deployed on Hammer by checking the /apps folder. Initially I thought the JSP might be cached, but then I realised it was not just the JSP changes that were missing: none of my other changes had taken effect either.

I suspected, albeit as a long shot, that the problem might be specific to the environment. So I repeated the same steps on Hoist, only to prove my theory wrong: it showed exactly the same symptom. I went so far as to rename start.jar to start2.jar, and yet the server still managed to start successfully. How was that even possible?

Upon carefully inspecting the jvmctl node, I finally found the culprit.

APP_OPTS=-Denv=dev -Dconfig=/apps/ncm/jetty/ncm/webapps/ndp/ -Dndpqa=yes -Djetty.http.port=9900 -Dorg.eclipse.jetty.server.Request.maxFormContentSize=20000000 -Dorg.eclipse.jetty.util.FileResource.checkAliases=false -Djetty.home=/apps/ncm/jetty -Djetty.base=/apps/ncm/jetty/ncm -jar /apps/ncm/jetty/start.jar --lib=/apps/ncm/jetty/ncm/lib/nla.jar /apps/ncm/jetty/ncm/etc/jetty.xml

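The line makes numerous references to the folder /apps/ncm, whereas it should be /apps/. The node was starting a different Jetty installation under /apps/ncm, not the code I had just deployed.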
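A small check along these lines might have caught the problem sooner. It is only a sketch: the helper names absolute_paths and path_roots are made up for illustration, and the sample string is a shortened copy of the APP_OPTS value quoted above. It pulls every absolute path out of the option string and reports which top-level folders they point into.

```python
import shlex

# Shortened sample of the APP_OPTS value quoted above.
APP_OPTS = (
    "-Denv=dev -Djetty.http.port=9900 "
    "-Djetty.home=/apps/ncm/jetty -Djetty.base=/apps/ncm/jetty/ncm "
    "-jar /apps/ncm/jetty/start.jar "
    "--lib=/apps/ncm/jetty/ncm/lib/nla.jar /apps/ncm/jetty/ncm/etc/jetty.xml"
)


def absolute_paths(opts):
    """Return every absolute path mentioned in a JVM option string (hypothetical helper)."""
    paths = []
    for token in shlex.split(opts):
        # For "-Dkey=value" or "--lib=value" look at the value; otherwise at the whole token.
        _, sep, value = token.partition("=")
        candidate = value if sep else token
        if candidate.startswith("/"):
            paths.append(candidate)
    return paths


def path_roots(paths, depth=2):
    """Return the distinct leading folders (first `depth` components) of the given paths."""
    roots = set()
    for path in paths:
        parts = path.strip("/").split("/")
        roots.add("/" + "/".join(parts[:depth]))
    return roots
```

Calling path_roots(absolute_paths(APP_OPTS)) on the sample gives a single root, /apps/ncm, which makes it obvious at a glance that every path points into the same, unintended folder.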

I must have looked at this line of configuration more than ten times. It kept eluding me because the word 'ncm' looked so innocuous sitting there. The eureka moment only hit me when I was checking the contents of /apps/ncm/jetty, fully expecting to see start2.jar. Of course start2.jar wouldn't be there. The sight of start.jar made me mumble "of course, of course, that's it!" and I felt a great sense of satisfaction.

Story 2: Why did changing JspWrapper have no effect?

Once I had deployed my code changes on Hammer, I noticed that the changes made to JspWrapper.java had no effect. However, a new class added to the same package as JspWrapper was clearly working.

Since JspWrapper was accessed in a less-than-usual fashion (from a JSP file), I wondered if the compiled JSP was cached, so I renamed JspWrapper to JspWrapper2 and updated the references accordingly. Yep, it worked. Then I changed it back to JspWrapper and, much to my surprise, it failed again.

I suspected there was a rogue JspWrapper class lurking somewhere. To prove it, I renamed JspWrapper to JspWrapper2 again, only this time without changing the places that referenced it. The system could still find a JspWrapper class, even though my code no longer contained one.

Now, how to locate this rogue class, which had to be sitting in one of the jar files? I used this command to list them:

find . -name '*.jar'
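Listing the jars still leaves the job of spotting the odd one out by eye. As a rough sketch of a more direct approach, the script below opens each jar under a folder and reports those that contain a class with a given simple name. The function name jars_containing is hypothetical, and it assumes the jars are ordinary zip archives, which is how jar files are normally packaged.

```python
import os
import zipfile


def jars_containing(root, simple_name):
    """Return sorted paths of jars under `root` holding a class called `simple_name`."""
    wanted = "/" + simple_name + ".class"
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if not name.endswith(".jar"):
                continue
            jar_path = os.path.join(dirpath, name)
            try:
                with zipfile.ZipFile(jar_path) as jar:
                    entries = jar.namelist()
            except zipfile.BadZipFile:
                continue  # not a readable archive, skip it
            # Prefix "/" so that classes in the default package match as well.
            if any(("/" + entry).endswith(wanted) for entry in entries):
                hits.append(jar_path)
    return sorted(hits)
```

For example, jars_containing(".", "JspWrapper") would list every jar below the current folder that ships its own copy of JspWrapper.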

I went through the list of jars carefully, and one jar caught my eye: nla-backup.jar. What the hell was that? Oh, I remembered. Before making the changes to the infrastructure Java classes, I had backed up the old nla.jar by renaming it to nla-backup.jar, just in case something went wrong.

I had committed the jar to GitHub deliberately, because I thought storing the file on GitHub was safer than keeping it locally. Obviously that had now backfired. There were two JspWrapper classes (and many other duplicate classes) on the classpath. Which one the JVM loaded depended on the order in which the jars ended up on the classpath, which was effectively nondeterministic across environments. In my local environment, it happened to be the newer class. On Hammer, however, it was the old one that got loaded.
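To guard against this kind of clash in future, a check for classes that appear in more than one jar could be run before deploying. The following is a minimal sketch under the assumption that all jars of interest live below a single folder; duplicate_classes is a hypothetical name, not part of any existing tool.

```python
import os
import zipfile
from collections import defaultdict


def duplicate_classes(root):
    """Map each .class entry found in two or more jars under `root` to those jars."""
    owners = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            if not name.endswith(".jar"):
                continue
            jar_path = os.path.join(dirpath, name)
            with zipfile.ZipFile(jar_path) as jar:
                # Use a set so an entry repeated inside one jar is counted once.
                for entry in set(jar.namelist()):
                    if entry.endswith(".class"):
                        owners[entry].append(jar_path)
    return {cls: jars for cls, jars in owners.items() if len(jars) > 1}
```

An empty result means no class is shipped twice. Any non-empty result is worth a look, since the copy that wins depends on classpath order.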
