Resource Description Framework


The Resource Description Framework is a method to describe and exchange graph data. It was originally designed as a data model for metadata by the World Wide Web Consortium. It provides a variety of syntax notations and formats, of which the most widely used is Turtle.
RDF is a directed graph composed of triple statements. An RDF graph statement is represented by: a node for the subject, an arc from subject to object, representing a predicate, and a node for the object. Each of these parts can be identified by an Internationalized Resource Identifier (IRI). An object can also be a literal value. This simple, flexible data model has a lot of expressive power to represent complex situations, relationships, and other things of interest, while also being appropriately abstract.
RDF was adopted as a W3C recommendation in 1999. The RDF 1.0 specification was published in 2004, and the RDF 1.1 specification in 2014. SPARQL is a standard query language for RDF graphs. RDF Schema (RDFS), Web Ontology Language (OWL) and SHACL (Shapes Constraint Language) are ontology languages that are used to describe RDF data.

Overview

The RDF data model is similar to classical conceptual modeling approaches. It is based on the idea of making statements about resources in expressions of the form subject–predicate–object, known as triples. The subject denotes the resource; the predicate denotes traits or aspects of the resource, and expresses a relationship between the subject and the object.
For example, one way to represent the notion "The sky has the color blue" in RDF is as the triple: a subject denoting "the sky", a predicate denoting "has the color", and an object denoting "blue". RDF therefore uses the term subject where an entity–attribute–value model in object-oriented design would typically use object or entity: entity (the sky), attribute (color), and value (blue).
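The sky example can be sketched in code as a three-field record. This is a minimal illustration in plain Python rather than an RDF library; the `Triple` type and the `http://example.org/` identifiers are assumptions made up for the example.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement: subject, predicate, object."""
    subject: str
    predicate: str
    object: str


# Hypothetical identifiers under example.org; RDF itself defines none of them.
EX = "http://example.org/"

sky_is_blue = Triple(
    subject=EX + "sky",          # the resource being described
    predicate=EX + "hasColor",   # the trait or relationship
    object=EX + "blue",          # the value, here another resource
)

# The same statement read in entity-attribute-value terms.
entity, attribute, value = sky_is_blue
```

Unpacking the record into `entity`, `attribute` and `value` shows that the two models carry the same three pieces of information under different names.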
RDF is an abstract model with several serialization formats. In addition, the particular encoding for resources or triples can vary from format to format.
This mechanism for describing resources is a major component in the W3C's Semantic Web activity: an evolutionary stage of the World Wide Web in which automated software can store, exchange, and use machine-readable information distributed throughout the Web, in turn enabling users to deal with the information with greater efficiency and certainty. RDF's simple data model and ability to model disparate, abstract concepts have also led to its increasing use in knowledge management applications unrelated to Semantic Web activity.
A collection of RDF statements intrinsically represents a labeled, directed multigraph. This makes an RDF data model better suited to certain kinds of knowledge representation than other relational or ontological models.
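The multigraph reading can be illustrated with a small sketch: each triple is an arc labeled by its predicate, and two nodes may be joined by more than one arc. The statements and the `ex:` names below are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical statements; the names come from no real vocabulary.
triples = [
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:alice", "ex:worksWith", "ex:bob"),  # a second arc between the same nodes
    ("ex:bob", "ex:knows", "ex:alice"),      # direction matters: a separate arc
]

# Group arc labels by (source node, target node).
arcs = defaultdict(list)
for subject, predicate, obj in triples:
    arcs[(subject, obj)].append(predicate)

# Node pairs joined by more than one labeled arc make the graph a multigraph.
parallel = {pair: labels for pair, labels in arcs.items() if len(labels) > 1}
```

Here `parallel` holds the one pair of nodes connected by two differently labeled arcs in the same direction.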
As RDFS, OWL and SHACL demonstrate, one can build additional ontology languages upon RDF.

History

The initial RDF design, intended to "build a vendor-neutral and operating system-independent system of metadata", derived from the W3C's Platform for Internet Content Selection, an early web content labelling system, but the project was also shaped by ideas from Dublin Core, and from the Meta Content Framework, which had been developed during 1995 to 1997 by Ramanathan V. Guha at Apple and Tim Bray at Netscape.
A first public draft of RDF appeared in October 1997, issued by a W3C working group that included representatives from IBM, Microsoft, Netscape, Nokia, Reuters, SoftQuad, and the University of Michigan.
In 1999, the W3C published the first recommended RDF specification, the Model and Syntax Specification. This described RDF's data model and an XML serialization.
Two persistent misunderstandings about RDF developed at this time: firstly, due to the MCF influence and the RDF "Resource Description" initialism, the idea that RDF was specifically for use in representing metadata; secondly, the idea that RDF was an XML format rather than a data model, when in fact only the RDF/XML serialisation is XML-based. RDF saw little take-up in this period, but there was significant work done in Bristol, around ILRT at Bristol University and HP Labs, and in Boston at MIT. RSS 1.0 and FOAF became exemplar applications for RDF in this period.
The recommendation of 1999 was replaced in 2004 by a set of six specifications: "The RDF Primer", "RDF Concepts and Abstract Syntax", "RDF/XML Syntax Specification (Revised)", "RDF Semantics", "RDF Vocabulary Description Language 1.0", and "The RDF Test Cases".
This series was superseded in 2014 by the following six "RDF 1.1" documents: "RDF 1.1 Primer", "RDF 1.1 Concepts and Abstract Syntax", "RDF 1.1 XML Syntax", "RDF 1.1 Semantics", "RDF Schema 1.1", and "RDF 1.1 Test Cases".

RDF topics

Vocabulary

The vocabulary defined by the RDF specification is as follows:

Classes

rdf
  • rdf:XMLLiteral: the class of XML literal values
  • rdf:Property: the class of properties
  • rdf:Statement: the class of RDF statements
  • rdf:Alt, rdf:Bag, rdf:Seq: containers of alternatives, unordered containers, and ordered containers
  • rdf:List: the class of RDF Lists
  • rdf:nil: an instance of rdf:List representing the empty list
rdfs
  • rdfs:Resource: the class resource, everything
  • rdfs:Literal: the class of literal values, e.g. strings and integers
  • rdfs:Class: the class of classes
  • rdfs:Datatype: the class of RDF datatypes
  • rdfs:Container: the class of RDF containers
  • rdfs:ContainerMembershipProperty: the class of container membership properties, rdf:_1, rdf:_2, ..., all of which are sub-properties of rdfs:member

Properties

rdf
  • rdf:type: an instance of rdf:Property used to state that a resource is an instance of a class
  • rdf:first: the first item in the subject RDF list
  • rdf:rest: the rest of the subject RDF list after rdf:first
  • rdf:value: idiomatic property used for structured values
  • rdf:subject: the subject of the RDF statement
  • rdf:predicate: the predicate of the RDF statement
  • rdf:object: the object of the RDF statement
rdf:Statement, rdf:subject, rdf:predicate, rdf:object are used for reification.
rdfs
  • rdfs:subClassOf: the subject is a subclass of a class
  • rdfs:subPropertyOf: the subject is a subproperty of a property
  • rdfs:domain: a domain of the subject property
  • rdfs:range: a range of the subject property
  • rdfs:label: a human-readable name for the subject
  • rdfs:comment: a description of the subject resource
  • rdfs:member: a member of the subject resource
  • rdfs:seeAlso: further information about the subject resource
  • rdfs:isDefinedBy: the definition of the subject resource
This vocabulary is used as a foundation for RDF Schema, where it is extended.
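Reification, mentioned above, describes a statement as a resource in its own right by means of rdf:Statement, rdf:subject, rdf:predicate and rdf:object, together with rdf:type. The following is a minimal sketch of that pattern in plain Python; the `reify` helper, the blank node label `_:s1` and the `example.org` identifiers are assumptions introduced for illustration, while the namespace IRI is the standard one for the rdf prefix.

```python
# Standard namespace IRI for the "rdf" prefix.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def reify(statement_id, subject, predicate, obj):
    """Return the four triples that describe one statement as a resource.

    Hypothetical helper: terms are plain strings, not library objects.
    """
    return [
        (statement_id, RDF + "type", RDF + "Statement"),
        (statement_id, RDF + "subject", subject),
        (statement_id, RDF + "predicate", predicate),
        (statement_id, RDF + "object", obj),
    ]


# Describe the statement "the sky has the color blue" itself.
reified = reify(
    "_:s1",  # hypothetical blank node label standing for the statement
    "http://example.org/sky",
    "http://example.org/hasColor",
    "http://example.org/blue",
)
```

Further triples can then use `_:s1` as their subject, for example to record who asserted the statement.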

Serialization formats

Several common serialization formats are in use, including:
  • Turtle, a compact, human-friendly format.
  • TriG, an extension of Turtle to datasets.
  • N-Triples, a very simple, easy-to-parse, line-based format that is not as compact as Turtle.
  • N-Quads, a superset of N-Triples, for serializing multiple RDF graphs.
  • JSON-LD, a JSON-based serialization.
  • N3 or Notation3, a non-standard serialization that is very similar to Turtle, but has some additional features, such as the ability to define inference rules.
  • RDF/XML, an XML-based syntax that was the first standard format for serializing RDF.
  • RDF/JSON, an alternative syntax for expressing RDF triples using a simple JSON notation.
RDF/XML is sometimes misleadingly called simply RDF because it was introduced among the other W3C specifications defining RDF and it was historically the first W3C standard RDF serialization format. However, it is important to distinguish the RDF/XML format from the abstract RDF model itself. Although the RDF/XML format is still in use, other RDF serializations are now preferred by many RDF users, both because they are more human-friendly, and because some RDF graphs are not representable in RDF/XML due to restrictions on the syntax of XML QNames.
With a little effort, virtually any arbitrary XML may also be interpreted as RDF using GRDDL (Gleaning Resource Descriptions from Dialects of Languages).
RDF triples may be stored in a type of database called a triplestore.
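To make the line-based N-Triples format listed above more concrete, the sketch below writes one triple per line, with IRIs in angle brackets, blank nodes prefixed by `_:`, and literals in double quotes. It is deliberately simplified: it assumes terms are given as hypothetical `(kind, value)` pairs, handles only plain string literals (no datatypes or language tags), and escapes only backslashes, quotes and line breaks.

```python
def nt_term(term):
    """Format one term given as ("iri" | "bnode" | "literal", value)."""
    kind, value = term
    if kind == "iri":
        return f"<{value}>"
    if kind == "bnode":
        return f"_:{value}"
    if kind == "literal":
        # Escape the backslash first so later escapes are not doubled.
        escaped = (
            value.replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
            .replace("\r", "\\r")
        )
        return f'"{escaped}"'
    raise ValueError(f"unknown term kind: {kind}")


def nt_line(subject, predicate, obj):
    """Serialize one triple as a single N-Triples-style line."""
    return f"{nt_term(subject)} {nt_term(predicate)} {nt_term(obj)} ."


line = nt_line(
    ("iri", "http://example.org/sky"),       # hypothetical identifiers
    ("iri", "http://example.org/hasColor"),
    ("literal", "blue"),
)
```

Because every statement is written out in full on its own line, the output is easy to parse but less compact than Turtle, which allows prefixes and shared subjects.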

Resource identification

The subject of an RDF statement is either a uniform resource identifier (URI) or a blank node, both of which denote resources. Resources indicated by blank nodes are called anonymous resources. They are not directly identifiable from the RDF statement. The predicate is a URI which also indicates a resource, representing a relationship. The object is a URI, a blank node or a Unicode string literal.
As of RDF 1.1, resources are identified by Internationalized Resource Identifiers (IRIs); IRIs are a generalization of URIs.
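The rules above about which kind of term may occupy each position of a statement can be written down as a small check. This is a sketch under assumptions: terms are represented as hypothetical `(kind, value)` pairs, and `check_triple` is an illustrative helper, not part of any RDF library.

```python
# Which kinds of term may appear in each position of a triple.
ALLOWED = {
    "subject": {"iri", "bnode"},
    "predicate": {"iri"},
    "object": {"iri", "bnode", "literal"},
}


def check_triple(subject, predicate, obj):
    """Return the positions whose term kind is not allowed there."""
    terms = {"subject": subject, "predicate": predicate, "object": obj}
    return [
        position
        for position, (kind, _value) in terms.items()
        if kind not in ALLOWED[position]
    ]


# A blank node subject with a literal object is well-formed.
ok = check_triple(
    ("bnode", "b0"),
    ("iri", "http://example.org/name"),  # hypothetical identifier
    ("literal", "Sky"),
)

# A literal subject and a blank node predicate are both rejected.
bad = check_triple(
    ("literal", "Sky"),
    ("bnode", "p"),
    ("iri", "http://example.org/thing"),
)
```

An empty result means every position holds an allowed kind of term; otherwise the list names the offending positions.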
In Semantic Web applications, and in relatively popular applications of RDF like RSS and FOAF, resources tend to be represented by URIs that intentionally denote, and can be used to access, actual data on the World Wide Web. But RDF, in general, is not limited to the description of Internet-based resources. In fact, the URI that names a resource does not have to be dereferenceable at all. For example, a URI that begins with "http:" and is used as the subject of an RDF statement does not necessarily have to represent a resource that is accessible via HTTP, nor does it need to represent a tangible, network-accessible resource; such a URI could represent absolutely anything. However, there is broad agreement that a bare URI (one without a '#' character) which returns a 300-level coded response when used in an HTTP GET request should be treated as denoting the internet resource that it succeeds in accessing.
Therefore, producers and consumers of RDF statements must agree on the semantics of resource identifiers. Such agreement is not inherent to RDF itself, although there are some controlled vocabularies in common use, such as Dublin Core Metadata, which is partially mapped to a URI space for use in RDF. The intent of publishing RDF-based ontologies on the Web is often to establish, or circumscribe, the intended meanings of the resource identifiers used to express data in RDF. For example, the URI:
is intended by its owners to refer to the class of all Merlot red wines by vintner, a definition which is expressed by the OWL ontology—itself an RDF document—in which it occurs. Without careful analysis of the definition, one might erroneously conclude that an instance of the above URI was something physical, instead of a type of wine.
Note that this is not a 'bare' resource identifier, but is rather a URI reference, containing the '#' character and ending with a fragment identifier.
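The distinction between a bare URI and a URI reference with a fragment identifier can be checked mechanically. The sketch below uses `urldefrag` from Python's standard library; the URI shown is a hypothetical stand-in and is not the one discussed in the text.

```python
from urllib.parse import urldefrag

# Hypothetical URI reference used only for illustration.
uri_ref = "http://example.org/wine#Merlot"

# Split the reference into the part before '#' and the fragment identifier.
base, fragment = urldefrag(uri_ref)

# A bare URI has no fragment identifier.
is_bare = fragment == ""
```

Only the part before the '#' character is sent to a server in an HTTP request, which is one reason the two forms are treated differently.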