Metadata and Resources

Berlin, 2 January 2001

Resources 

I have read John McClure's Note on XML and RDF Schemas (http://www.legalxml.org/DocumentRepository/UnofficialNotes/Clear/UN_10005_1999_11_01.htm) and a document on Dublin Core and RDF by Miller and others (http://www.ukoln.ac.uk/metadata/resources/dc/datamodel/WD-dc-rdf/) of 1 July 1999. I reread the W3C RDF Recommendation of 22 February 1999 (http://www.w3.org/TR/REC-rdf-syntax) and the candidate recommendation on RDF Schema (http://www.w3.org/TR/2000/CR-rdf-schema-20000327/) of 27 March 2000.

Metadata 

RDF, the Resource Description Framework, is a standard for capturing metadata. So with RDF we are in principle talking about metadata only, not about markup within documents. I use metadata here in the sense of data which describe a document; data which are in principle "outside" the document. Some metadata may, however, also be found within the document (for instance the date of the document).

Comparison DTD XML-Schema RDF 

What is the basic difference between well-formed XML (without a DTD), a DTD, XML Schema and RDF (Schema)? From my perspective the difference is the level of complexity at which the relationships between elements can be described. Marking up the text in a document describes the structure of that document. Here we are talking one level of abstraction higher: what interests us is not the structure of particular documents, but rather the structure of the element structure.

A well-formed XML document without a DTD can have any number of element names in any order, appearing with any frequency. There is no description of the relationship between the elements, other than the order in which they appear and some local hierarchy through local nesting.
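As a minimal sketch of this point, the following Python snippet parses a small document with the standard library parser, which checks well-formedness only. The sample document and its element names are invented for illustration and follow no existing vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the element names, their order and their repetition
# are arbitrary. No DTD is referenced, so nothing constrains them.
DOC = """<judgement>
  <court>some court</court>
  <anything><nested>text</nested></anything>
  <court>the same name again, in another position</court>
</judgement>"""


def is_well_formed(text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


root = ET.fromstring(DOC)
# The only structure we can recover is document order and local nesting.
names = [element.tag for element in root.iter()]
print(names)
print(is_well_formed("<a><b></a></b>"))  # overlapping tags are rejected
```

The parser accepts any element name in any position; it rejects only violations of the nesting rules, such as overlapping tags.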

In a DTD a hierarchical relationship (a tree) can be described.

XML Schema provides for a possibility of (single) inheritance of properties of elements: classes and subclasses are introduced. 

RDF provides a framework for the description of resources, which are identified by a URI. Resources are, where possible, described through their relationship (called a property) with other resources. Such another resource may be as small as a single element (name). RDF can be used to describe the relationship between elements. A property can be a resource itself. RDF Schema provides for the definition of classes and subclasses. A subclass may inherit from more than one class (multiple inheritance). Properties may form a class as well. Resources may be instances of one or more classes. RDF instances may be described by using multiple schemata from multiple sources. The relationship between the various schemata (mapping) can be captured in RDF.
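To make the model concrete, here is a rough sketch in plain Python (no RDF library) that stores statements as (resource, property, value) triples and follows subclass links, including a class with two superclasses. The property names rdf:type and rdfs:subClassOf come from the RDF and RDF Schema vocabularies; every name prefixed with "ex:" is a hypothetical example, not part of any existing dictionary.

```python
# Statements as (subject, property, object) triples.
# All "ex:" names are invented for illustration.
TRIPLES = {
    ("ex:Judgement", "rdfs:subClassOf", "ex:Decision"),
    ("ex:Judgement", "rdfs:subClassOf", "ex:PublicDocument"),  # second parent
    ("ex:Decision", "rdfs:subClassOf", "rdfs:Resource"),
    ("ex:doc1", "rdf:type", "ex:Judgement"),
}


def superclasses(cls, triples):
    """All classes reachable from cls by following rdfs:subClassOf."""
    found = set()
    todo = [cls]
    while todo:
        current = todo.pop()
        for subject, prop, obj in triples:
            if subject == current and prop == "rdfs:subClassOf" and obj not in found:
                found.add(obj)
                todo.append(obj)
    return found


def classes_of(resource, triples):
    """Classes a resource belongs to, directly or through subclassing."""
    direct = {o for s, p, o in triples if s == resource and p == "rdf:type"}
    result = set(direct)
    for cls in direct:
        result |= superclasses(cls, triples)
    return result


print(sorted(classes_of("ex:doc1", TRIPLES)))
```

Because ex:Judgement has two parents, ex:doc1 ends up as an instance of both ex:Decision and ex:PublicDocument, which a single tree could not express.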

RDF seems to be a good choice for describing metadata. RDF is a tool for describing relationships between element names at a higher level of complexity than well-formed XML, a DTD or XML Schema allows. A particularly attractive feature is that RDF provides an easy way to create a metadata structure from metadata dictionaries which have been built independently of one another.

Why structure? 

One step back: why do we need structure? This question is implicit in the occasional question: do we need valid XML documents? I call this the paradox of structure: we don't want structure and at the same time we do want structure. We don't want structure, as it limits us in describing reality. We do want structure, as otherwise we do not know where to start, and once we have described reality we need structure to find again what we have described. The technical point of view is no doubt to have as much structure as possible, as it increases computer performance.

The imperfection of structure 

The moment an idea is put into words, it is captured within a structure and by definition reduced to imperfection. Any further structure defined, like an XML DTD or Schema, will impose further limits on the object for which it was designed. The goal is therefore not so much to make a perfect structure (as that is impossible), but to be conscious of the limits of that structure. One may design many different structures. When describing information, one chooses the structure which best captures that particular reality. The other side of the coin is that the work which has to be done to capture that reality should not be disproportionate to the benefits which one will derive from the effort. For describing the content of a document, a DTD may very well suffice. Defining an RDF Schema for that purpose alone may be overkill. For describing metadata one needs more complex structures, and using RDF becomes appropriate.

Mapping

Comparing structures, that is, mapping DTDs, Schemas and other structures to one another, will in my view become of central importance. In John's Note I found in 3.9. the following remark: "Certainly if ... a mapping capability between DTD's is required, the inheritance via RDF Schema is the right - and only - choice." I tend to agree with this statement, based on the above analysis of RDF. I also refer to 4.1.3. Inter-Vocabulary Relationships in the RDFS candidate recommendation. Now, to get down to business, I propose:

An experiment 

I would like to invite those interested to take part in a little experiment. For this experiment I definitely need John McClure, as he seems to have the most experience with RDF. His DCN Dictionary (http://www.dataconsortium.org/namespace/DCD100.xml), written in RDF, is impressive and inspires me to follow the RDF path further. (By the way: John, do you have an explanation of the structure of the DC Dictionary available in a more abstract form (perhaps a graphic), showing the various relationships between the 14 elements, types, properties and classes?)

Also, I hope John Joergensen will join in this experiment, as thesauri and taxonomies belong to his profession as a librarian. 

The experiment is the following: let us describe in RDF the quintessential document in the legal world, which is known by many different words: judgement, opinion, verdict, Urteil, Vollstreckungsbescheid, vonnis, arrest. I mean the decision rendered by a court after two or more parties have presented their arguments; a decision which is enforceable against the losing party (hereinafter "judgement").

This description already shows how difficult it is to describe, for XML purposes and in a universal way, something so essential; something which in common language would rarely pose any difficulties. To overcome the inconvenience that the same concept is expressed through different words, and that sometimes the same word describes different concepts in different jurisdictions, I propose to list the various aspects of a judgement. If we had 10 aspects 1-10, then for example 1-7 would apply in California, 4-10 in Louisiana, 1,3,4,7,8,9 in the UK, and 2,3,6,7,8,10 in Germany. To describe in an RDF Schema a particular kind of judgement in a particular jurisdiction, one would be able to refer every time to one single namespace, picking out those aspects which describe and apply to that kind of judgement.

  1. a decision
  2. public
  3. a reasoning (the judgement contains an explanation of why the particular decision was made) (reasoning could be further specified, like: based on statute, case law or other, restricted by the claim etc.)
  4. in writing
  5. proceedings (have taken place between two or more parties) (proceedings could be broken down into further aspects)
  6. enforceable
  7. appealable
  8. binding (on the parties)
  9. public issuing body
  10. judicial (in the sense of not executive or legislative) issuing body

I would like these ten aspects to be described in an RDF dictionary, and to have various people describe, by reference to this "Judgement Namespace", one or more kinds of judgements in their jurisdiction.
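As a rough sketch of how such a shared namespace could be used, the Python snippet below encodes the ten aspects and the purely illustrative jurisdiction assignments given above (California 1-7, and so on). The namespace URI and the short aspect names are assumptions made up for this example; no such dictionary exists yet, and the assignments are not claims about the actual law of any jurisdiction.

```python
# Hypothetical namespace URI, invented for this sketch.
NS = "http://example.org/judgement#"

# The ten aspects from the list above; the short names are assumptions.
ASPECTS = {
    1: "decision", 2: "public", 3: "reasoning", 4: "inWriting",
    5: "proceedings", 6: "enforceable", 7: "appealable", 8: "binding",
    9: "publicIssuingBody", 10: "judicialIssuingBody",
}

# Illustrative assignments only, copied from the example in the text.
JURISDICTIONS = {
    "California": set(range(1, 8)),
    "Louisiana": set(range(4, 11)),
    "UK": {1, 3, 4, 7, 8, 9},
    "Germany": {2, 3, 6, 7, 8, 10},
}


def describe(jurisdiction):
    """URIs of the aspects that apply to a judgement in this jurisdiction."""
    return [NS + ASPECTS[n] for n in sorted(JURISDICTIONS[jurisdiction])]


# Aspects shared by every jurisdiction in the example.
common = set.intersection(*JURISDICTIONS.values())

print(describe("Germany"))
print(sorted(ASPECTS[n] for n in common))
```

Each jurisdiction picks its own subset, yet every description refers to the same ten identifiers, which is what makes the descriptions comparable across jurisdictions.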

Who wants to take up this challenge with me?

Murk Muller
Lexml