Skip to content

How to publish a spell check dictionary as plugin

Overview

OmegaT provides a feature for translators to check their translations using spell-check dictionaries. Developers can enhance this functionality by creating custom spell-check dictionary plugins. These plugins must implement the ISpellCheckDictionary interface, which defines the necessary methods to integrate with OmegaT. OmegaT also provides abstract classes for plugins. There are AbstractHunspellDictionary and AbstractMorfologikDictinonary abstract classes.

This document provides guidance on creating a plugin, explaining the purpose of the interface methods and abstract methods, and giving practical tips for implementation.

ISpellCheckDictionary Interface

The ISpellCheckDictionary interface defines methods for integrating custom spell-check dictionaries. Implementing this interface allows your plugin to support different dictionary types, such as Hunspell and Morfologik.

public interface ISpellCheckerDictionary extends Closeable {
    /**
     * Get Hunspell dictionary.
     * 
     * @return Dictionary object when the language module has. Otherwise, null.
     */
    default org.apache.lucene.analysis.hunspell.Dictionary getHunspellDictionary(String language) {
        return null;
    }

    /**
     * Get Morfologik dictionary.
     * 
     * @return Dictionary object when the language module has. Otherwise, null.
     */
    default morfologik.stemming.Dictionary getMorfologikDictionary(String language) {
        return null;
    }

    default Path installHunspellDictionary(Path dictionaryDir, String language) {
        return null;
    }

    /**
     * Get a dictionary type.
     * 
     * @return type of dictionary. If the module provides nothing, return null.
     */
    SpellCheckDictionaryType getDictionaryType();
}

Method Descriptions

  1. getHunspellDictionary(String language)

    • Purpose: Provides access to a Hunspell dictionary for the specified language.
    • Return Value:
      • A Dictionary object if the language module supports Hunspell.
      • null if Hunspell is not supported.

Example Usage:

@Override
public org.apache.lucene.analysis.hunspell.Dictionary getHunspellDictionary(String language) {
    // Load and return the Hunspell dictionary for the specified language.
}

  1. getMorfologikDictionary(String language)

    • Purpose: Provides access to a Morfologik dictionary for the specified language.
    • Return Value:
      • A Dictionary object if the language module supports Morfologik.
      • null if Morfologik is not supported.

Example Usage:

@Override
public morfologik.stemming.Dictionary getMorfologikDictionary(String language) {
    // Load and return the Morfologik dictionary for the specified language.
}

  1. installHunspellDictionary(Path dictionaryDir, String language)

    • Purpose: Installs a Hunspell dictionary for the specified language in a given directory.
    • Parameters:
      • dictionaryDir: The directory where the dictionary will be installed.
      • language: The language code for the dictionary.
    • Return Value:
      • The path to the installed dictionary.
      • null if installation is not supported.

Example Usage:

@Override
public Path installHunspellDictionary(Path dictionaryDir, String language) {
    // Logic to download or copy the Hunspell dictionary into dictionaryDir.
}

  1. getDictionaryType()

    • Purpose: Specifies the type of dictionary supported by the plugin.
    • Return Value:
      • A SpellCheckDictionaryType enum value, e.g., HUNSPELL, MORFOLOGIK.
      • null if no dictionary type is provided.

Example Usage:

@Override
public SpellCheckDictionaryType getDictionaryType() {
    return SpellCheckDictionaryType.HUNSPELL;
}

Example Implementation

public class MyHunspellDictionaryPlugin implements ISpellCheckDictionary {

    @Override
    public org.apache.lucene.analysis.hunspell.Dictionary getHunspellDictionary(String language) {
        // Load and return the Hunspell dictionary for the specified language.
        return new org.apache.lucene.analysis.hunspell.Dictionary(...);
    }

    @Override
    public SpellCheckDictionaryType getDictionaryType() {
        return SpellCheckDictionaryType.HUNSPELL;
    }

    @Override
    public void close() {
        // Cleanup resources if needed.
    }
}

Creating a Hunspell Spell-Check Dictionary Plugin

OmegaT provides an abstract class, AbstractHunspellDictionary, to simplify the process of implementing a Hunspell-based spell-check dictionary. Developers can use this class to create plugins that support specific languages by implementing a minimal set of methods.

This document provides guidance on using AbstractHunspellDictionary, including method descriptions, implementation steps, and a complete example for a Catalan Hunspell dictionary.

Abstract Class: AbstractHunspellDictionary

The AbstractHunspellDictionary class implements the ISpellCheckDictionary interface and includes additional utilities for managing Hunspell dictionaries. Developers need to subclass this abstract class and implement its key methods to provide language-specific dictionary support.

Key Features of AbstractHunspellDictionary

  1. Dictionary Management

    • Locates and loads Hunspell .aff and .dic files.
    • Provides access to the Hunspell dictionary for a given language.
  2. Helper Methods

    • protected abstract String[] getDictionaries()
      • Returns the list of supported language codes for the dictionary.
    • protected String getDictionary(String language)
      • Finds the appropriate dictionary for a given language.
    • protected abstract InputStream getResourceAsStream(String resource)
      • Retrieves the dictionary resource stream.
  3. Predefined Implementation of ISpellCheckDictionary Methods

    • getHunspellDictionary(String language)
      • Loads the Hunspell dictionary for the specified language.
    • installHunspellDictionary(Path dictionaryDir, String language)
      • Installs the Hunspell dictionary files in a specified directory.
    • getDictionaryType()
      • Returns SpellCheckDictionaryType.HUNSPELL.
    • close()
      • Closes any open streams to release resources.

Implementation Steps

  1. Subclass AbstractHunspellDictionary

    • Create a new class extending AbstractHunspellDictionary.
  2. Implement Required Methods

    • Define the supported language codes in getDictionaries().
    • Provide logic to retrieve resource streams for dictionary files in getResourceAsStream(String resource).
  3. Package the Implementation

    • Include your dictionary files (.aff and .dic) in the project resources directory.
    • Package the implementation class as a plugin (e.g., a JAR file).

Example: Catalan Hunspell Dictionary

Below is a complete implementation of a Catalan Hunspell dictionary plugin using AbstractHunspellDictionary.

Dictionary Files

Ensure the following files are placed in the resources directory: - ca.aff - ca.dic

Implementation

public class CatalanHunspellDictionary extends AbstractHunspellDictionary {

    // Supported language codes
    private static final String[] HUNSPELL = { "ca" };

    /**
     * Provides the list of supported languages.
     * @return an array of language codes.
     */
    @Override
    protected String[] getDictionaries() {
        return HUNSPELL;
    }

    /**
     * Retrieves the resource stream for a given dictionary file.
     * @param resource the resource file name.
     * @return an InputStream for the resource.
     */
    @Override
    protected InputStream getResourceAsStream(final String resource) {
        return getClass().getResourceAsStream(resource);
    }
}

Conclusion

The AbstractHunspellDictionary class reduces the complexity of implementing Hunspell dictionaries. By following the steps and using the provided example, developers can quickly create plugins for specific languages. By implementing the ISpellCheckDictionary interface, developers can extend OmegaT’s functionality, enabling support for additional spell-checking languages or dictionary types.