General concepts for the sweety cataloger

 

Version: 1

Comments:

Author: Jean Christophe Desconnets (jcd@teledetection.fr)

Created at: 19/06/2007 22:31:00

Modified at: 20/02/2008 17:10:00

Translation: Kim Agrawal (kim@auromail.net)

Comments on English version: English screen shot entry

 

 

What is MDweb?

Concept of roles and associated functionalities

Concept of template or metadata profile

Concept of resources

Concept of metadata sheets or references

Concept of data-entry levels of a metadata sheet

 

What is MDweb?

MDweb is a generic, multi-language, multi-standard tool for cataloging and locating data and documents, both spatial and non-spatial. It is designed to build, manage, administer and consult data catalogs via the Web and, where necessary, to give access to the referenced data or documents. MDweb is based on international metadata standards (ISO 19115 and ISO 19139) and communication standards (OGC Catalogue Service Specification, Z39.50), which makes it possible to query across catalogs. The CSW-2 Web service specifications will be implemented in a future version. MDweb currently meets the requirements of metadata suppliers working within spatial data infrastructures such as the French geocatalog and geoportal initiatives and the European INSPIRE directive.

The tool’s genericity and originality rest largely on its database, which can accommodate future changes in the standards: new standards can be added, and existing ones modified or extended, without affecting the schema or the processes. The same is true of data-entry elements.

 

Figure: Centralized 3-tier architecture

 

Figure: 3-tier architecture with distributed reference bases

 

Through its metadata profile feature, MDweb lets the cataloger select the metadata elements used to document a reference and specify their properties. To improve search engine performance and help users formulate their queries, MDweb also includes a thesaurus database (used to help select and enter keywords) and a spatial database (used to help search by spatial selection), both of which can be customized for the usage context. The relationships between thesaurus terms are used, most notably, to propose expanded or refined search requests when a request returns too few or too many results.
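The expand-or-refine behaviour described above can be sketched as follows. This is only an illustration under stated assumptions, not MDweb's actual implementation: the small thesaurus, the `search` callback and the thresholds `min_hits` and `max_hits` are all hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical thesaurus: each term lists its broader and narrower terms.
THESAURUS: Dict[str, Dict[str, List[str]]] = {
    "land use": {"broader": ["environment"], "narrower": ["agriculture", "forest"]},
    "agriculture": {"broader": ["land use"], "narrower": []},
    "forest": {"broader": ["land use"], "narrower": []},
    "environment": {"broader": [], "narrower": ["land use"]},
}


def suggest_terms(
    term: str,
    search: Callable[[str], list],
    min_hits: int = 1,
    max_hits: int = 50,
) -> List[str]:
    """Propose alternative keywords when a search returns too few or too many hits.

    `search` is an assumed callback that returns the list of references
    matching a keyword; the thresholds are illustrative defaults.
    """
    hits = len(search(term))
    entry = THESAURUS.get(term, {"broader": [], "narrower": []})
    if hits < min_hits:
        return entry["broader"]   # too few results: expand the request
    if hits > max_hits:
        return entry["narrower"]  # too many results: refine the request
    return []                     # result count is acceptable: no suggestion
```

A real thesaurus would usually carry more relation types (related terms, synonyms); only broader and narrower terms are modelled here to keep the sketch short.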

MDweb is in operational use in a variety of institutional and thematic settings, in France and abroad. Users and uses of the tool include: a desertification observatory (ROSELT), integrated coastal zone management (PRCM in West Africa, SYSCOLAG in Languedoc-Roussillon), a national environmental inventory (SIA in Cape Verde), glaciology and paleoclimatology (Great-ICE IRD), hydrosystems (GIP Loire), natural hazards and territorial development, and a large number of territorial services and communities.

 

In its current version (v.1.5), MDweb consists of three standard modules that cover the entirety of its functional specifications:

 

- a password-protected management module for entering and updating metadata and the data files and images attached to it, for importing and exporting metadata in XML format, and for managing predefined sets of values that automate the entry of some elements (contact details, technical elements of the standards);

- a search engine for finding metadata by spatial selection (WMS, WMC cartographic clients), by data type, or via an advanced multi-criteria search mode;

- an administrative module (or reference-base manager) for setting MDweb’s parameters, managing the metadata in existing catalogs, customizing interfaces, managing profiles, spatial databases (support for WMS, WMC clients) and thesauri, configuring the cartographic client (choice of layers and styles, SLD editor), and managing user accounts.
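To make the search criteria more concrete, here is a minimal sketch of a search that combines spatial selection with a data-type filter. The `Reference` class, the `search_references` function and the bounding-box representation are assumptions made for illustration; they do not describe MDweb's internal search engine.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Assumed representation: (west, south, east, north) in decimal degrees.
BBox = Tuple[float, float, float, float]


@dataclass
class Reference:
    title: str
    data_type: str
    bbox: BBox


def intersects(a: BBox, b: BBox) -> bool:
    """Return True when two bounding boxes overlap or touch.

    Boxes that cross the antimeridian are not handled in this sketch.
    """
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search_references(
    catalog: List[Reference],
    bbox: Optional[BBox] = None,
    data_type: Optional[str] = None,
) -> List[Reference]:
    """Keep the references that satisfy every criterion that was supplied."""
    results = []
    for ref in catalog:
        if bbox is not None and not intersects(ref.bbox, bbox):
            continue
        if data_type is not None and ref.data_type != data_type:
            continue
        results.append(ref)
    return results
```

Leaving a criterion as `None` disables it, so the same function covers a spatial-only search, a type-only search and a combined search.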

 

 

 


Concept of roles and associated functionalities

By design, MDweb is a multi-user application, which requires clearly distinct roles for the use of its different functionalities. Each user is assigned an account, to which the administrator gives a role. The role limits the user’s access to the modules that correspond to the tasks assigned to that user. There are four roles:

- The administrator is the super-user and can access all modules for managing the catalogs, the users and the tool’s configuration.

- The validator is an expert responsible for validating the contents of the references, and thus becomes the guarantor of the quality of the references and data attached to the catalogs. To fulfil this role, the validator has access to all the references of a catalog, whoever created them.

- The cataloger describes the data. This role is most logically assigned to the producer of the data to be referenced, who is the person best placed to describe the contents and characteristics of that data. The cataloger has access to the modules that allow the entry, updating and management of his or her own references.

- The end user has access to the catalog search-and-access module. Two cases can arise: a user who authenticates as a privileged user obtains the rights to use the private-access module; an unauthenticated user has rights only to the public search-and-access module.
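The four roles above can be summarized in a short access-control sketch. The names `Role`, `can_manage_reference` and `search_module` are hypothetical, and the rule that an administrator may manage every reference is an assumption derived from the super-user description, not a documented MDweb behaviour.

```python
from enum import Enum


class Role(Enum):
    ADMINISTRATOR = "administrator"
    VALIDATOR = "validator"
    CATALOGER = "cataloger"
    END_USER = "end user"


def can_manage_reference(role: Role, user: str, owner: str) -> bool:
    """Decide whether `user` may work on a reference created by `owner`."""
    if role in (Role.ADMINISTRATOR, Role.VALIDATOR):
        # Validators see every reference of a catalog; administrators are
        # assumed to do so as well because they are super-users.
        return True
    if role is Role.CATALOGER:
        # Catalogers only manage their own references.
        return user == owner
    # The last role only searches the catalog and manages no references.
    return False


def search_module(authenticated: bool) -> str:
    """Return the search-and-access module available to a searching user."""
    if authenticated:
        return "private search-and-access"
    return "public search-and-access"
```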

 

 


Concept of template or metadata profile

A profile, or adaptation, is a document or schema (in the sense of a data structure) that specifies the implementation options of a standard for a particular purpose. In essence, a profile does not contradict the standard to which it refers and does not introduce, in principle, new concepts. Rather, it describes the standard or a sub-set of it so that it can be implemented and used in a particular context. However, elements that do not exist in the standard (extended elements) can be included in it. These description elements will complement the standard and will be useful in the specific context in which the profile is going to be used. In addition, a profile of a standard allows an international standard to be adapted culturally or linguistically for a particular national or regional context.

A community can thus define profiles for particular types of data sets. For example, a profile for matrix or ‘raster’ data sets will retain only those elements specific to this data type. A profile can also manage certain specifics or rules that an organization may want to apply to metadata elements. A profile, for example, could specify which elements are mandatory and which are optional in a metadata sheet.
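As an illustration of a profile that marks elements as mandatory or optional, the sketch below models a profile as a small data structure and checks a metadata sheet against it. The `Profile` class and the element names are hypothetical simplifications; they are not ISO 19115 element names and do not reflect how MDweb stores profiles.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Profile:
    name: str
    mandatory: List[str]
    optional: List[str] = field(default_factory=list)

    def missing_mandatory(self, sheet: Dict[str, str]) -> List[str]:
        """List the mandatory elements that are absent or empty in `sheet`."""
        return [element for element in self.mandatory if not sheet.get(element)]


# Hypothetical profile for raster data sets (element names are illustrative).
raster_profile = Profile(
    name="raster",
    mandatory=["title", "abstract", "spatial_resolution"],
    optional=["lineage"],
)
```

Calling `raster_profile.missing_mandatory(sheet)` on a partially filled sheet returns the elements that still need to be entered before the sheet satisfies the profile.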

 

MDweb includes 9 profiles as standard. They correspond to 9 data types:

  • Types of data series:

o    Geographical database or geodatabase

o    Temporal database

o    Digital map

  • Types of data sets:

o    Hardcopy map

o    Vector layer

o    Image – aerial photo

o    Text document

o    Spreadsheet data

o    Bibliographical reference

 


Concept of resources

In principle, metadata standards, and the international standard in particular, apply to digital data, but they can also be applied to analogue documents such as maps, plans, aerial photographs, etc. In such cases, the documentation and cataloging of the data always refer to the actual document. Moreover, data sets of this type usually consist of a clearly identifiable collection of documents. For digital data, on the other hand, defining what constitutes data, or a data set, is more difficult and often depends on the institutional or technological context of the organization that produced the data. In general, digital data can be broken down into a hierarchy that goes from data attribute to entity type to data set and, finally, to data series. This view of data can be described more simply with the general term ‘resource’, which covers all the concepts of the data hierarchy shown in the figure.

To illustrate this concept, we take as an example the land use maps of a territory, in this case Oued Mird (Morocco). This resource, of type ‘digital map’, can be broken down into the hierarchy of resources described above, expressed here in UML terms. At the highest level is the data series: the collection of maps on the same theme but produced for different observation periods (land use in the 1990s, land use in the 2000s, etc.). At the data set level, we consider just one item from this collection, for example the land use map of the 1990s. The next level down, the entity type, corresponds to the thematic layers that make up the land use map of the 1990s. In our example, we have selected the ‘polygon’ layer of land-use classes; other layers, such as the village layer, can also be part of the map. Finally, the most basic level, the attribute type, is the set of properties of the ‘polygon’ layer. One example of an attribute type is the attribute ‘percentage of ligneous cover’.
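The four-level hierarchy and the Oued Mird example can be written down as nested data structures. This is a sketch for illustration only: the class names are assumptions, and MDweb itself handles only the data series and data set levels.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EntityType:
    name: str
    attribute_types: List[str] = field(default_factory=list)


@dataclass
class DataSet:
    title: str
    entity_types: List[EntityType] = field(default_factory=list)


@dataclass
class DataSeries:
    title: str
    data_sets: List[DataSet] = field(default_factory=list)


# Entity types (thematic layers) of the 1990s land use map.
land_use_classes = EntityType(
    name="land-use classes (polygon layer)",
    attribute_types=["percentage of ligneous cover"],
)
villages = EntityType(name="villages")

# Data sets: one map per observation period.
map_1990s = DataSet("Land use map, 1990s", [land_use_classes, villages])
map_2000s = DataSet("Land use map, 2000s")

# Data series: the whole collection of land use maps for the territory.
series = DataSeries("Land use maps of Oued Mird", [map_1990s, map_2000s])
```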

The levels handled by MDweb are limited to:

      • data series
      • data sets

Definitions

Data series: A collection of distinct data sets related to each other by common characteristics such as their mode of acquisition or processing (for example, satellite images), their spatial extent, or the type of their contents. A data series is synonymous with a data collection. This denomination is used in MDweb for the data types ‘digital map’, ‘geodatabase’ and ‘temporal database’.

Data set: A set of related data, unmistakably identifiable as connected to each other by common characteristics such as their mode of acquisition or processing, their spatial extent, etc. A data set can be regarded as a small set of data or as a subset of a larger one. This denomination is used in MDweb for the data types ‘hardcopy map’, ‘vector layer’, ‘image – aerial photo’, ‘text document’, ‘spreadsheet data’ and ‘bibliographical reference’.
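The correspondence between the nine standard data types and the two levels handled by MDweb can be captured in a small lookup. The function name `resource_level` and the lower-case spelling of the type names are choices made for this sketch, not MDweb identifiers.

```python
# Data types documented at the data series level.
SERIES_TYPES = {"geodatabase", "temporal database", "digital map"}

# Data types documented at the data set level
# (an ASCII hyphen is used in "image - aerial photo" for simplicity).
SET_TYPES = {
    "hardcopy map",
    "vector layer",
    "image - aerial photo",
    "text document",
    "spreadsheet data",
    "bibliographical reference",
}


def resource_level(data_type: str) -> str:
    """Return 'data series' or 'data set' for one of the nine standard types."""
    key = data_type.strip().lower()
    if key in SERIES_TYPES:
        return "data series"
    if key in SET_TYPES:
        return "data set"
    raise ValueError(f"unknown data type: {data_type!r}")
```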

Hierarchy between data series and data sets

MDweb establishes a hierarchy between data types using the concept of parent and child profiles (see Concept of template or metadata profile).

In the standard version, this is the hierarchy:

 

 


Concept of metadata sheets or references

 

In this document, the terms ‘metadata sheet’ and ‘reference’ are used interchangeably: both apply to the same object. A metadata sheet or reference is defined as a set of metadata elements filled in by a user to describe a data collection or data set or, more generally, a resource.

The concept of a metadata sheet relates to the structure and nature of the metadata elements that it consists of, with these elements originating from the ISO 19115 standard.

The concept of a reference additionally relates to a perspective of metadata as an item of a data catalog managed by MDweb.

Concept of data-entry levels of a metadata sheet

The data-entry level relates to the number of elements (and their characteristics) used to describe a resource. It corresponds to different levels of metadata usage, because the information required to describe a resource depends on the purpose the metadata will serve. For searching and locating resources, less detailed and less complete information is enough. Documentation purposes call for greater detail and completeness, because the resources will need to be distributed and transferred. Thus, for the cataloging of resources, which is the basis of searches for them, simplified metadata may be sufficient.

These different contexts or levels of metadata usage can lead to the definition of several levels of metadata detail. The international standard defines two levels of detail, or conformity levels. The first, or ‘basic’, conformity level corresponds to the purposes of resource cataloging. For this, the standard proposes a set of mandatory elements, the ‘metadata core profile’, made up of the elements needed to identify the resource and to summarize its contents. This level can only be used for cataloging services and as a basis for metadata services designed for locating resources. The second, or ‘complete’, conformity level includes the metadata elements needed to fully document a resource: those necessary to identify, evaluate, extract, use and manage geographic resources.

On the basis of the international standard’s definitions, we have identified three levels of detail in the profiles for the metadata:

- a basic level,

- an extended level,

- and a complete level.

The basic level is based on the minimum metadata elements specified in the standard.

The extended level is based on the basic level and additionally includes those metadata elements that would allow the exchange and transfer of the resource and the accurate description of the resource’s origins (source data and processes used). This latter requirement is essential for the reuse of a resource for scientific purposes. For data types offered in the standard MDweb version, the extended and the complete levels are one and the same.
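The relationship between the three levels can be sketched as sets of elements, each level building on the previous one. The element names below are hypothetical placeholders, not the actual ISO 19115 core elements or MDweb's element lists.

```python
# Hypothetical element names, for illustration only.
BASIC_ELEMENTS = ["title", "abstract", "contact"]

# Elements assumed to support exchange, transfer and description of origins.
EXTENDED_EXTRA = ["distribution_format", "lineage_sources", "lineage_process_steps"]

LEVELS = {
    "basic": BASIC_ELEMENTS,
    "extended": BASIC_ELEMENTS + EXTENDED_EXTRA,
}
# For the data types of the standard MDweb version, the extended and
# complete levels are one and the same.
LEVELS["complete"] = LEVELS["extended"]


def elements_for(level: str) -> list:
    """Return a copy of the list of elements to enter at the given level."""
    return list(LEVELS[level])
```

Because the extended list is built from the basic one, every element required at the basic level is also present at the extended and complete levels.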


 

Contacts

 

 

IRD / ESPACE unit (US 140)

500, rue Jean François Breton, 34093 Montpellier Cedex 05, France

TEL: +33 (0)4 67 54 87 02

J.C. Desconnets, jcd@teledetection.fr

 

MDweb project site: www.mdweb-project.org

Online demo: demo16.mdweb-project.org