Mapping transformations

The org.incenp.obofoundry.sssom.transform package provides classes and interfaces intended to allow transforming SSSOM mappings into other objects (including other SSSOM mappings).

1. Building blocks

A mapping filter, represented by the IMappingFilter interface, allows to select mappings. The interface defines a single method filter() which takes a mapping and returns a boolean value. Implementations of that interface can test any of the mapping metadata slot to decide whether to “accept” the mapping (by returning true) or “reject” it (by returning false).

For example, here is a lambda implementation that filters out any mapping that is not using a skos:exactMatch predicate:

IMappingFilter filter = (mapping) -> mapping.getPredicateId().equals("http://www.w3.org/2004/02/skos/core#exactMatch");

A mapping transformer, represented by the IMappingTransformer<T> interface, transforms a mapping into something else. The interface is generic, where the generic parameter represents the type of objects the transformer generates from a mapping. It defines a single method transform(), which takes a mapping and should return an object of the generic parameter type T.

The parameter type can itself be a mapping, leading to a particular case of transformer that takes a mapping and returns another mapping. For example, here is an implementation that takes a mapping and returns an inverted copy of it:

IMappingTransformer<Mapping> inverter = (mapping) -> mapping.invert();

2. Processing rule

Mapping filters and transformers can be used on their own. However, they are also intended to be used as building blocks to create processing rules, which can themselves be used to apply arbitrarily complex treatments to a set of mappings.

A processing rule, represented by the MappingProcessingRule<T> class, is made of four elements:

  • a filter (a IMappingFilter object) that decides whether the rule applies to a given mapping;
  • a preprocessor (a IMappingTransformer<Mapping>​, that is a transformer that produces mappings from mappings) that can be used to apply a pre-treatment on the mapping before the rest of the rule is applied;
  • a generator (a IMappingTransformer<T>​ object, with the same parameter type as the MappingProcessingRule itself) that will take the filtered, preprocessed mapping and return an object derived from the mapping;
  • a callback method (a IMappingProcessorCallback object), which may be used to implement custom treatments with possible side-effects;

All four elements are optional. If no filter is set, a rule will apply to all mappings; if no preprocessor is set, the generator will be applied to the original, unmodified mapping; if no generator is set, the rule will not produce anything; if no callback set, no custom treatment will be applied. Obviously a rule with no preprocessor and no generator and no callback does not make any sense. A rule with a preprocessor only does make sense, though, because even if there is no generator in that particular rule, the result of the preprocessing will be visible to any following rules.

A rule can also have tags, which are simply String objects. Tags have no effect on the behaviour of the rule, but they may be used in the context of the mapping processor (see below) for two purposes:

  • selectively enable or disable some rules;
  • identify which rule generated a given object.

3. The mapping processor

Now that we have rules, we need to apply them to mappings. That is the job of the mapping processor (MappingProcessor<T> class). Given a list of rules, the processor will apply them to a list of mappings as follows:

  1. For each rule in the list:
    1. If the rule has a callback, invoke that callback with two arguments: (1) the rule’s filter (if any), and the current set of mappings.
    2. For each mapping in the list:
      1. Apply the filter to the mapping. If there is no filter, this is equivalent to a filter that accepts any mapping.
      2. If the filter rejects the mapping, skip to the next mapping, otherwise continue.
      3. Apply the preprocessor to the mapping. If there is no preprocessor, this is equivalent to a preprocessor that returns the original mapping unmodified.
      4. If the preprocessor returns null, drop the mapping from the list (meaning that no further rule will be applied to this mapping) and skip to the next mapping; otherwise, set the current mapping to the one returned by the preprocessor and continue.
      5. If there is a generator, apply it to the mapping; if it returns a non-null value (hereafter called a product), collect it.

Once all rules have been applied, the processor will return a list of the products collected at step 1.2.5.

The following example (somewhat contrived) shows a processor that generates text representations of mappings (as String objects).

MappingProcessor<String> processor = new MappingProcessor<String>();

// Adds a rule that excludes any mapping with a predicate other than
// skos:exactMatch.
processor.addRule(new MappingProcessingRule<String>(
    // filter: select mappings with a predicate other than skos:exactMatch
    (mapping) -> !mapping.getPredicateId().equals("http://www.w3.org/2004/02/skos/core#exactMatch"),
    // preprocessor: return null for all selected mappings
    (mapping) -> null,
    // no generator
    null));
    
// All following rules will only be applied to mappings with a
// predicate of skos:exactMatch, so there's no more need to filter
// on the predicate.

// Adds a rule that will generate a String to highlight high-confidence
// mappings.
processor.addRule(new MappingProcessingRule<String>(
    // filter: select mappings with a high confidence value
    (mapping) -> mapping.getConfidence() >= 0.95,
    // no preprocessor
    null,
    // generator
    (mapping) -> String.format("I am very confident the mapping between %s and %s is correct.", mapping.getSubjectId(), mapping.getObjectId())));
    
// Adds a rule that simply produces a text representation of a mapping.
processor.addRule(new MappingProcessingRule<String>(
    // no filter (we could also explicitly use: (mapping) -> true)
    null,
    // no preprocessor
    null,
    // generator
    (mapping) -> String.format("%s is the same thing as %s.", mapping.getSubjectId(), mapping.getObjectId())));
        
// Run the processor
List<String> mappingTexts = processor.process(mappings);

To tag a rule, create the rule first (without adding it immediately to a processor) and access the tags set:

MappingProcessingRule<String> exactMatchOnly = new MappingProcessingRule<String>(
    (mapping) -> !mapping.getPredicateId().equals("http://www.w3.org/2004/02/skos/core#exactMatch"),
    (mapping) -> null,
    null);
exactMatchOnly.getTags().add("exact-match-only");

When rules are tagged, use the processor’s includeRules() method to instruct the processor to use only some rules for the next call to process(). More precisly, only the rules that have at least one of the specified tags will be run, all other rules will be ignored.

Conversely, use the excludeRules() method to force the processor to ignore any rule that has at least one of the specified tag.

The processor can emit a “generated” event every time a rule generates a product. Client code can react to such events by registering a IMappingProcessorListener object. That interface defines a single method that accepts three parameters:

  • the rule that generated the product;
  • the mapping the rule was being applied to;
  • the product that was generated from that mapping.

4. Reading rules from a file in SSSOM/Transform

While processing rules can be created “manually” by calling the MapppingProcessingRule constructor as in the example above, they can also be parsed from a file in the SSSOM/Transform language (SSSOM/T), a small domain-specific language specifically intended to express them. The SSSOMTransformReader class provides a parser for the SSSOM/Transform language.

Note that the SSSOM/Transform language is slightly more restrictive than what the mapping processor model allows. In the mapping processor model, a processing rule can simultaneously have a callback function, a preprocessor function, and a generator function. In SSSOM/Transform, a rule can only have either a callback function, or a preprocessor function, or a generator function.

The parser is independent of the type of objects one wishes to derive from mappings, and knows nothing of the functions that can be used in rules. Therefore, in order to use the parser, you must first define your own ”dialect” of SSSOM/Transform (called a SSSOM/Transform application) – basically, the function(s) that you want to use to produce whatever you need to produce from mappings. Once defined, your dialect must be formally described as an implementation of the ISSSOMTransformApplication interface. The parser must then be provided with an instance of that implementation, and will call its various methods during parsing.

Please refer to the API documentation of the ISSSOMTransformApplication for more details. To facilitate the creation of a SSSOM/Transform application, a base implementation is also available. It supports all functions defined in the description of the SSSOM/T base language, and allows to register new application-specific functions which are each represented by implementations of the ISSSOMTFunction interface.