Generating SIOC data out of mailing list archives

SIOC provides “an ontology for describing discussion forums and posts on topic threads in online community sites” and so-called exporters for blogging/CMS software such as WordPress or Drupal. That makes it well suited to publishing information about the activities of communities on websites. An exporter for mailing lists (which are heavily used by communities too) is not provided. I will outline how such a SIOC mailing list exporter may work…

I like to reuse existing work, so that’s the first policy.

The second policy is that I don’t want to deal with the email archives directly, but will use the RSS or Atom representation of the archive as the source of information.
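To illustrate the second policy, here is a minimal sketch of reading the items of an RSS 2.0 feed with nothing but the XML parser that ships with the JDK. It is not the code of the exporter itself: the class name `FeedItemsSketch`, the helper methods and the sample feed with its example.org URLs are made up for illustration, and Atom feeds (which use `entry` instead of `item`) are not handled.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of the actual exporter.
public class FeedItemsSketch {

    /** Returns "title | link" for every <item> element of an RSS 2.0 document. */
    static List<String> readItems(String rssXml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(rssXml.getBytes(StandardCharsets.UTF_8)));
        NodeList items = doc.getElementsByTagName("item");
        List<String> result = new ArrayList<>();
        for (int i = 0; i < items.getLength(); i++) {
            Element item = (Element) items.item(i);
            result.add(text(item, "title") + " | " + text(item, "link"));
        }
        return result;
    }

    /** Text of the first descendant element with the given tag, or "" if there is none. */
    static String text(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample feed; a real archive feed would be fetched over HTTP.
        String rss = "<rss version=\"2.0\"><channel><title>Example list</title>"
                + "<item><title>First mail</title><link>http://example.org/list/1</link></item>"
                + "<item><title>Re: First mail</title><link>http://example.org/list/2</link></item>"
                + "</channel></rss>";
        for (String line : readItems(rss)) {
            System.out.println(line);
        }
    }
}
```

Each mail then arrives as one feed item, so the exporter never has to parse mbox files or HTML archive pages.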

SIOC mailinglist exporter architecture

How should the information provided by mailing list archives be understood in a SIOC context?

First of all, the mailing list archive is obviously a sioc:Forum (the archive itself is represented by an RSS or Atom feed) with sioc:Posts (individual mails, i.e. RSS/Atom items). The sender of a mail is a sioc:User, which is the sioc:has_creator of the sioc:Post. The website hosting the mailing list archive is the sioc:Site, and the sioc:Forum mentioned above sioc:has_host this sioc:Site. Got it?
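The mapping just described can be written down as a handful of triples. The following is only a sketch under my own assumptions, not the RDFReactor-based code: it builds Turtle by plain string concatenation, all class and method names (`SiocMappingSketch`, `Mail`, `toTurtle`) and all example.org URIs are hypothetical, and linking a post to its forum via sioc:has_container is an addition that the description above does not spell out.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the Forum/Post/User/Site mapping.
public class SiocMappingSketch {

    /** One mail from the archive: the URI of the post and the URI of its sender. */
    static final class Mail {
        final String postUri;
        final String senderUri;

        Mail(String postUri, String senderUri) {
            this.postUri = postUri;
            this.senderUri = senderUri;
        }
    }

    /** Serialises site, forum, posts and users as Turtle (URIs are not escaped). */
    static String toTurtle(String siteUri, String forumUri, List<Mail> mails) {
        StringBuilder ttl = new StringBuilder();
        ttl.append("@prefix sioc: <http://rdfs.org/sioc/ns#> .\n");
        // The website hosting the archive.
        ttl.append("<").append(siteUri).append("> a sioc:Site .\n");
        // The archive (feed) is the forum, hosted by the site.
        ttl.append("<").append(forumUri).append("> a sioc:Forum ;\n")
           .append("    sioc:has_host <").append(siteUri).append("> .\n");
        for (Mail mail : mails) {
            // Each mail (feed item) is a post created by its sender.
            // sioc:has_container is my assumption for tying the post to the forum.
            ttl.append("<").append(mail.postUri).append("> a sioc:Post ;\n")
               .append("    sioc:has_container <").append(forumUri).append("> ;\n")
               .append("    sioc:has_creator <").append(mail.senderUri).append("> .\n");
            // Repeated for every mail of the same sender; duplicate triples are harmless in RDF.
            ttl.append("<").append(mail.senderUri).append("> a sioc:User .\n");
        }
        return ttl.toString();
    }

    public static void main(String[] args) {
        List<Mail> mails = Arrays.asList(
                new Mail("http://example.org/list/1", "http://example.org/user/alice"),
                new Mail("http://example.org/list/2", "http://example.org/user/bob"));
        System.out.print(toTurtle("http://example.org/", "http://example.org/list", mails));
    }
}
```

A real exporter would rather use an RDF library than string concatenation, so that URIs and literals are escaped properly.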

As a test subject I took the SIOC-dev mailing list at Google Groups; development is supported by RDFReactor.

The results of my work can be found 1. as Java code and 2. as the resulting SIOC data.

Comments and any other feedback are welcome!