
Sunday, December 8, 2013

Feature of the Week: Python Client for the BibSonomy REST-API

This week's feature touches several activities that are currently happening:
  1. We are moving our development infrastructure to Bitbucket.
  2. We are reanimating the Python client for the REST API.
  3. We are developing a CKAN extension.
These activities are related, which is why all three of them appear in this post.

Moving to Bitbucket

As an important step to further open BibSonomy to other developers and to ease the development of applications that use the BibSonomy infrastructure, we are migrating to Bitbucket by the next release (which is planned for the end of January). At the moment, several extensions and plugins (for PHP, Typo3, Android, etc.) are being moved to the new Bitbucket BibSonomy account. Other code will follow in the coming weeks, in particular the releases currently available at dev.bibsonomy.org.

Python Client

The old Python client for the BibSonomy REST API was no longer maintained and had no real use case to exercise (and thus test) it. Since we now need a Python client (see the next section), we have started developing a new one.

As a first use case to test the code and implement some useful functionality, we wrote the small script onefile.py, which downloads all your posts from BibSonomy into one HTML file that you can use offline. This is handy for situations where you don't have an internet connection, in particular since the script can also download all your documents! The documentation, which you can access with --help, shows you what is possible:

usage: onefile.py [-h] [-u USER] [-t TAG [TAG ...]] [-d]
                  [--bookmark-file BFILE] [--publication-file BFILE]
                  [--css-file CSSFILE] [--no-bookmarks] [--no-publications]
                  [--test]
                  user apikey

Download posts from BibSonomy and store them in a file.

positional arguments:
  user                  BibSonomy user name
  apikey                corresponding API key (get it from
                        http://www.bibsonomy.org/settings?selTab=1)

optional arguments:
  -h, --help            show this help message and exit
  -u USER, --user USER  return posts for USER instead of user
  -t TAG [TAG ...], --tags TAG [TAG ...]
                        return posts that contain the given tags
  -d, --documents       download documents for publications
  --bookmark-file BFILE
                        file name for bookmarks
  --publication-file BFILE
                        file name for publications
  --css-file CSSFILE    write CSS to file
  --no-bookmarks        do not write bookmarks
  --no-publications     do not write publications
  --test                use test data

We are actively improving the script, so feedback and suggestions are highly welcome.
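Under the hood, a script like this simply issues authenticated HTTP requests against the REST API. The following minimal sketch shows how such a request could be assembled with the standard library; the endpoint path and the resourcetype/tags parameter names reflect the public REST API as we understand it, so treat them as assumptions and check the API documentation before relying on them:

```python
import base64
from urllib.parse import urlencode

API_BASE = "http://www.bibsonomy.org/api"

def posts_url(user, resource_type="bibtex", tags=None):
    """Build the REST URL for a user's posts (publications by default)."""
    params = {"resourcetype": resource_type}
    if tags:
        # multiple tags are space-separated; urlencode turns spaces into '+'
        params["tags"] = " ".join(tags)
    return f"{API_BASE}/users/{user}/posts?{urlencode(params)}"

def auth_header(user, apikey):
    """The API uses HTTP Basic auth with the API key as the password."""
    token = base64.b64encode(f"{user}:{apikey}".encode()).decode()
    return {"Authorization": f"Basic {token}"}
```

A client would pass these headers to its HTTP library of choice and page through the results.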

CKAN Extension

CKAN is a web-based platform for scientists to manage and publish dataset metadata as Linked Open Data. To better connect datasets with the publications that describe and use them, we are currently implementing a CKAN extension that allows users to connect their datasets with the corresponding publications from BibSonomy. A mockup screenshot shows how we intend to integrate publications into CKAN:


We think this is good news for all developers, since rapid development of BibSonomy-based applications in Python is now becoming much easier. Feedback and contributions are welcome - the source code is open and free for everybody.



Happy developing!


Friday, May 20, 2011

Feature of the Week: Improved Content-Negotiation Capabilities

Today I describe a rather technical feature of BibSonomy that is important for programmers and in particular for Semantic Web enthusiasts.
As I already mentioned in my post from EKAW 2010, we are constantly trying to improve BibSonomy's integration into the Linked Data Cloud. In the 2.0.14 release of BibSonomy a new content negotiation mechanism was implemented that augments our first implementation from 2007.

Instead of using the special /uri/ prefix, we now perform content negotiation on all URLs BibSonomy serves. Depending on the HTTP Accept header of your client, you are redirected (with the HTTP status code 303 See Other) to a view representing the requested resource(s) in a corresponding format. The following media types are currently supported:
application/rdf+xml
RDF output according to the BuRST specification. This is basically a combination of the RSS and SWRC schema to describe publication references. Hence, currently only the publication posts are returned.
application/json
All posts (bookmarks and publications) in JSON format. This is a lightweight JavaScript data structure.
text/csv
All posts (bookmarks and publications) as comma separated values (CSV).
text/x-bibtex
Publications in BibTeX format.
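The mechanics behind such a redirect are standard HTTP content negotiation: the server parses the client's Accept header, honours the q-values, and picks the best media type it can serve. The following sketch illustrates the general technique (it is not the actual BibSonomy code):

```python
SUPPORTED = ["application/rdf+xml", "application/json",
             "text/csv", "text/x-bibtex"]

def negotiate(accept_header, supported=SUPPORTED, default="text/html"):
    """Pick the best supported media type from an HTTP Accept header,
    honouring q-values; fall back to the normal HTML view."""
    prefs = []
    for part in accept_header.split(","):
        fields = part.strip().split(";")
        mtype = fields[0].strip()
        q = 1.0
        for f in fields[1:]:
            f = f.strip()
            if f.startswith("q="):
                try:
                    q = float(f[2:])
                except ValueError:
                    q = 0.0
        prefs.append((q, mtype))
    # highest q first; the sort is stable, so header order breaks ties
    for q, mtype in sorted(prefs, key=lambda p: -p[0]):
        if q <= 0:
            continue
        if mtype == default or mtype in supported:
            return mtype
    return default
```

A client that sends `Accept: application/json` would thus be redirected to the JSON view, while a plain browser request keeps getting HTML.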

The application/rdf+xml media type in particular is important for BibSonomy's integration into the Linked Data Cloud. There are still some issues we have to fix, e.g., the proper representation of author order and the vocabulary used to represent tagging information (e.g., using commontag.org instead of, or in addition to, the RSS taxonomy module). Many thanks to Sebastian Tramp and Pascal Hitzler for their helpful comments.


Finally, please note that content negotiation works for all URLs, even where it does not currently make sense. E.g., if you request the /login page as application/rdf+xml, you are redirected to /burst/login; since that page does not exist, you get a 404 Not Found error.

Thursday, October 14, 2010

EKAW 2010 and New Ideas

This week my colleague Andreas and I are attending the EKAW 2010 conference in Lisbon, Portugal. On Monday we had a nice tutorial on ontology learning from folksonomies, and in the meantime we have heard some interesting talks.

Semantic Pingback


One talk, namely Weaving a Social Data Web with Semantic Pingback by colleagues from the Agile Knowledge Engineering and Semantic Web Group at the University of Leipzig, particularly caught our attention.

The basic idea of pingback comes from the blogosphere and allows blog authors to get notified when somebody links to their posts. Tramp et al. extend this well-known technique with Semantic Web technologies, allowing the pingbacked server to learn more from the referencing web page than just the fact that an article has been referenced. E.g., one could state that someone knows the author.

You can probably guess the idea we had: BibSonomy could implement (semantic) pingback and thereby notify authors that one of their web pages (or even scientific publications) has been bookmarked in BibSonomy. The technology behind it is relatively simple, so I think we can implement it in the next few weeks. Since BibSonomy already supports RDF export (for bookmarks and publications), it is automatically semantic-pingback enabled!

This will then work out of the box with many blogging platforms and, in the case of semantic pingback, with OntoWiki - but in principle any HTTP server can support pingback. Thinking one step further, publishers of scientific articles could support pingback to get feedback on the popularity of their articles. Therefore, we may implement pingback for publications, too (technically, it makes no difference for us).
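For context, the classic (non-semantic) pingback ping is a single XML-RPC call, pingback.ping(sourceURI, targetURI), POSTed to the endpoint advertised by the target page. A minimal sketch of serializing such a request with Python's standard library (the URLs are hypothetical, and actually sending the request is omitted):

```python
import xmlrpc.client

def build_ping(source_uri, target_uri):
    """Serialize a pingback.ping request body (XML-RPC, as in the
    pingback specification); POST this to the target's pingback server."""
    return xmlrpc.client.dumps((source_uri, target_uri),
                               methodname="pingback.ping")

# hypothetical example URLs
body = build_ping("http://www.bibsonomy.org/url/somehash",
                  "http://example.org/some-article")
```

The semantic variant extends this handshake so that the pinged server can fetch RDF from the source and extract richer statements than the bare link.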

Linked Data


One thing I also learned from the Leipzig guys is that our content negotiation implementation supporting the Linked Data idea needs to be fixed: currently, only paths with the /uri/ prefix support content negotiation, but all pages should. The rationale for introducing the /uri/ prefix (as described in an earlier blog post) was that some browsers send an Accept header with "text/xml" in the first position, so users would get XML instead of HTML, which was not so nice. We will solve this problem by returning XML (or RDF/XML) only when the requesting client exclusively asks for that format; otherwise, we will always return HTML.
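The fix described above can be sketched as a small predicate (a sketch of the stated rule, not the actual BibSonomy code): serve XML only when every media type the client lists is an XML variant, otherwise fall back to HTML.

```python
XML_TYPES = {"text/xml", "application/xml", "application/rdf+xml"}

def serve_xml(accept_header):
    """True only if the client *exclusively* accepts XML variants;
    a browser sending e.g. 'text/xml,text/html' still gets HTML."""
    requested = {part.split(";")[0].strip()
                 for part in accept_header.split(",") if part.strip()}
    return bool(requested) and requested <= XML_TYPES
```

This keeps the browser experience intact while letting Linked Data clients negotiate RDF on every URL.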

RDF output


I already knew that our RDF export according to the SWRC ontology is not perfect. I took the opportunity to meet some Semantic Web experts and identify errors we can easily fix. E.g., linking a publication's PDF with the owl:sameAs property is too strong - we will use a property from the Dublin Core vocabulary to model this better.

New Features (From our Wishlist)


Adding all the above ideas to our feature list, I realized again that this list is always far too long. It contains a lot of cool features that we would implement immediately if we could, but we just don't have the resources. To let you know what we think would be cool, here is a quick list (really only a small part of the whole list): OAuth, OpenSocial, API versioning, full-text search of your uploaded PDFs and bookmarked web pages, a TeXlipse plugin, ... Feel free to add more using BibSonomy's issue tracker.

New Features (In the Pipeline)


Finally, I can say that we are currently working on two cool features which will be released soon.

You will have much more freedom to configure your CV page, because we are integrating a wiki renderer that basically allows you to add almost any content to the page.

Furthermore, we will introduce gold standard publication posts, i.e., posts that can be edited by several users so that they eventually constitute a complete set of metadata for an article. For example, have a look at this resource, which looks different from other resources in BibSonomy and can serve as a gold standard for the posts users create to reference that resource.
Additionally, gold standard posts can contain links to the articles the paper cites:

Thus we can represent the citation graph in BibSonomy.
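In its simplest form, such a citation graph is just an adjacency list over publication identifiers. A hypothetical sketch (the identifiers are made up and this is not BibSonomy's actual data model):

```python
# hypothetical interhash-style identifiers, not real BibSonomy data
citations = {
    "paper-a": ["paper-b", "paper-c"],  # paper-a cites b and c
    "paper-b": ["paper-c"],
    "paper-c": [],
}

def cited_by(graph, paper):
    """Invert the edges: which papers cite the given one?"""
    return sorted(p for p, cites in graph.items() if paper in cites)
```

Inverting the edges like this is what makes "cited by" views and popularity metrics possible on top of the stored citation links.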

OK, this was a pretty long blog post but I hope you enjoyed getting some news about what's going on "behind the scenes".
