Friday, March 27, 2009

FOW: You better save this as publication...

From Andreas' blog post you already know BibSonomy's scraping service, which enables BibSonomy to automatically extract publication metadata from some websites.

For some time now, this service has also checked whether a user posts a bookmark pointing to a scrapable site, in which case the user is given a corresponding hint.
We have now extended this feature so that text selected by the user before hitting the 'postBookmark' button is also checked. Thus, if you select BibTeX or a DOI and press "postBookmark", you may get a notice that you had better save this as a publication and not as a bookmark.

You might try it out yourself right now by selecting one of these text snippets and pressing "postBookmark":
  • 10.1007/978-3-540-73681-3_21
  • ISBN-13: 978-0201485417
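
For the curious, here is a minimal sketch of how such a check on the selected text could work. It is not BibSonomy's actual code: the class name `SelectionCheck`, the method `looksLikePublication`, and the three regular expressions are assumptions made for illustration, and they only cover DOIs, ISBN-13 numbers, and the start of a BibTeX entry.

```java
import java.util.regex.Pattern;

/** Hypothetical helper, not BibSonomy's implementation. */
final class SelectionCheck {
    // DOI: "10.", a 4-9 digit registrant code, "/", then a non-empty suffix.
    private static final Pattern DOI = Pattern.compile("\\b10\\.\\d{4,9}/\\S+");
    // ISBN-13, hyphens optional: prefix 978 or 979 followed by ten more digits.
    private static final Pattern ISBN13 = Pattern.compile("\\b97[89](?:-?\\d){10}\\b");
    // Start of a BibTeX entry such as "@article{" or "@inproceedings{".
    private static final Pattern BIBTEX = Pattern.compile("@\\w+\\s*\\{");

    /** Returns true if the selected text contains something that looks like publication data. */
    static boolean looksLikePublication(String selection) {
        if (selection == null || selection.trim().isEmpty()) {
            return false;
        }
        return DOI.matcher(selection).find()
            || ISBN13.matcher(selection).find()
            || BIBTEX.matcher(selection).find();
    }

    public static void main(String[] args) {
        // The two snippets from the list above.
        System.out.println(looksLikePublication("10.1007/978-3-540-73681-3_21"));
        System.out.println(looksLikePublication("ISBN-13: 978-0201485417"));
    }
}
```

A real check would probably hand the selection to the scrapers themselves rather than rely on patterns alone; the regular expressions here just make the idea concrete.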

Wednesday, March 25, 2009

CiteSmart

Last November we were contacted by Mounir Errami, the project leader of MireSoft, who suggested a partnership with us. MireSoft has a product called CiteSmart, a citation tool. It integrates nicely with Word and builds a bridge to web-based tools like BibSonomy or Connotea. It eases taking over data from web applications and is able to produce references in various formats for articles written in Word. In this way, it makes writing scientific articles easier and supports the work of researchers. Here is a screenshot:



We support this partnership as we think that BibSonomy needs to be connected with as many tools as possible. This broadens its community and makes BibSonomy more valuable for its users. To conclude: If you are using BibSonomy and you are searching for an easy way to work with Word, then we can definitely recommend CiteSmart.

Friday, March 20, 2009

New Release

We've just released a new version of BibSonomy which fixes several bugs and introduces new features.

A short overview of the fixed bugs:
  • The restriction to a certain user or certain years on author pages was broken, i.e., /publ/author/Jäschke/sys:year:2008+sys:user:jaeschke delivered no results. This works again now.
    Please note that to get the page sorted by year you need to add the parameter "sortPage=year", and if you want the latest articles on top, additionally the parameter "sortPageOrder=desc". This is a new feature which allows sorting the posts on a page by (almost) any BibTeX field. You can even combine fields using "|" as a delimiter, for example so that posts with the same year are sorted by author name.
  • The BibTeX field "abstract" was exported as "bibtexAbstract" in some cases. This caused some tools which use this field to break. Now the abstract is again contained in the field "abstract". Note: in the API XML this field is still named "bibtexAbstract". We will see how we can fix this in the future (though it's not really a bug there).
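
To make the combined sorting a bit more tangible, here is a rough sketch of how a value like "year|author" could be turned into a chained comparator. This is an illustration only, not the code running on BibSonomy: the class `PostSorter`, the representation of a post as a map of BibTeX fields, and the choice to let "desc" reverse the whole combined order are all assumptions.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical sketch; posts are modelled as maps from BibTeX field name to value. */
final class PostSorter {
    /**
     * Builds a comparator from a sortPage value such as "year" or "year|author".
     * Fields are compared as strings, which is fine for four-digit years.
     * Assumption: sortPageOrder "desc" reverses the combined order.
     */
    static Comparator<Map<String, String>> comparatorFor(String sortPage, String sortPageOrder) {
        // Start with a comparator that treats all posts as equal.
        Comparator<Map<String, String>> result = (a, b) -> 0;
        for (String field : sortPage.split(Pattern.quote("|"))) {
            // Later fields only decide the order when all earlier fields are equal.
            result = result.thenComparing(
                (Map<String, String> post) -> post.getOrDefault(field, ""));
        }
        return "desc".equalsIgnoreCase(sortPageOrder) ? result.reversed() : result;
    }
}
```

With `comparatorFor("year|author", "asc")`, two posts from 2008 would be ordered by author name and both would come before a post from 2009.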


New features include:

Some of the changes improve working with BibSonomy a lot and we will continue to transfer pages to the new backend. If you find bugs or have suggestions regarding the new pages, please let us know!
We will introduce the new features in more detail with the upcoming features of the week.

Friday, March 13, 2009

FOW: Fighting against the memory leak

Today's feature of the week is a bit more technical. As you might know, BibSonomy is based on a MySQL/Tomcat architecture. Usually BibSonomy runs very stably, but from time to time the Java virtual machine stops with a "java.lang.OutOfMemoryError: PermGen space" error. This mostly happens after a redeploy of the BibSonomy project on Tomcat. Why does this happen? The simple answer is: because the Java VM does not have enough memory for the so-called permanent generation space. This space is used to hold the Java classes in main memory. A simple solution is to give the JVM more PermGen space, but this does not solve the underlying problem, since the JVM usually has enough PermGen space. The only result of giving it more memory is that the error happens a bit later and not directly after the redeploy.

So we decided to search for the cause of the memory leak. We soon found out that there were some classes from the web application which the classloader could not remove from the PermGen space because they were "linked" to classes which had been loaded by the standard classloader. There can be several causes for that, and using the right tools (jmap and jhat from the JDK) plus a small program to find reference chains, we found the culprits:

* MySQL Connector/J (see http://bugs.mysql.com/bug.php?id=36565)
* iBatis (see https://issues.apache.org/jira/browse/IBATIS-540)
* JabRef
* Tomcat (see https://issues.apache.org/bugzilla/show_bug.cgi?id=46221)
* and some we could fix by just moving some JARs to the right places (see also here and here).

Identifying the culprits was an iterative task: fixing one leak made the next one appear ... At the beginning we did not know that there were so many candidates. We could fix iBatis by switching to a newer version; MySQL, JabRef and Tomcat were a little harder to fix.
For JabRef we had to modify the source code such that it does not start AWT. Additionally, a Tomcat LifecycleListener kills the timer of java.util.prefs.FileSystemPreferences after webapp shutdown using awful Java reflection hacks:

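// needs java.lang.reflect.Field and java.util.Timer
// load FileSystemPreferences through the listener's classloader
final Class<?> clazz = CleanupListener.class.getClassLoader().loadClass("java.util.prefs.FileSystemPreferences");
final Field f = clazz.getDeclaredField("syncTimer");
f.setAccessible(true); // the field is private
final Timer timer = (Timer) f.get(null); // static field, so no instance is needed
timer.cancel(); // stops the timer that keeps the webapp's classes reachable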

To fix the MySQL bug, the listener ensures on startup of the web application that the MySQL connection class is loaded before the web app and by the standard classloader, such that the cancellation timer threads (which are the cause of the leak) don't block unloading of the webapp. The loggers from the StandardContext in Tomcat (which, for whatever reason, are loaded via the webapp's classloader) are also killed by the listener.
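
The preloading idea can be sketched in a few lines of plain Java. This is not our actual listener: the class `DriverPreloader` is made up for illustration, and "com.mysql.jdbc.Driver" is only an assumed stand-in for whichever Connector/J class has to be loaded early. Inside Tomcat one would pass a parent of the webapp classloader instead of the system classloader used in the demo `main`.

```java
/** Hypothetical sketch of forcing a class to be loaded by a chosen classloader. */
final class DriverPreloader {
    /**
     * Loads and initializes the named class through the given loader.
     * Returns false if that loader cannot find the class.
     */
    static boolean preload(String className, ClassLoader loader) {
        try {
            // initialize = true also runs static initializers, so threads or
            // timers started there belong to this loader, not to the webapp.
            Class.forName(className, true, loader);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: the driver class name; it is only found if the
        // Connector/J JAR is on the classpath of the chosen loader.
        boolean loaded = preload("com.mysql.jdbc.Driver", ClassLoader.getSystemClassLoader());
        System.out.println("driver preloaded: " + loaded);
    }
}
```

The point of the design is the order of events: once a parent loader has defined the class, the webapp classloader delegates to it and never creates its own copy, so nothing started by that class holds a reference to the webapp's classes.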

After several weeks of work we have a leak-free application. The bad thing is that every library we are using can bring back a leak, and if we are not careful the leak will be back sooner than we would like. Unfortunately, we are not aware of a method which we could put into Tomcat or into our application that simply checks for memory leaks.

Hope you found this interesting and good luck with your own applications ...