gluetool features

A comprehensive list of gluetool features, available helpers, tips and tricks: all the ways gluetool has to help module developers.

Core

Module and gluetool configuration

Configuration of gluetool and of every module is gathered from several sources with different priorities and merged into a single store, accessible through the option() method. Values from later sources replace values set by earlier, lower-priority sources. That way it is possible to combine multiple configuration files for a module, e.g. a generic site-wide configuration with a user-specific configuration overriding the global settings. Options specified on the command line have the highest priority, overriding all configuration files.

Consider the following example module. It has a single option, whom, whose value is logged in the form of a greeting. The option has a default value, unknown being:

from gluetool import Module

class M(Module):
    name = 'dummy-module'

    options = {
        'whom': {
            'default': 'unknown being'
        }
    }

    def execute(self):
        self.info('Hi, {}!'.format(self.option('whom')))

With a configuration file, ~/.gluetool.d/config/dummy-module, you can change the value of whom:

[default]
whom = happz

As you can see, the configuration file for dummy-module is loaded and the option() method returns the correct value, happz.

Options specified on the command line are merged into the store transparently, without any additional action necessary.

Todo

  • re-record video because of name => whom
  • seealso:
    • options definitions

See also

Configuration files
to see what configuration files are examined.

Configuration files

For every module, and for gluetool itself, gluetool checks several possible sources of configuration, merging all information found into a single configuration store, which can be queried at runtime using the option() method.

Configuration files follow a simple INI format, with a single section called [default] containing all options:

[default]
option-foo = value bar

Warning

Options can have short and long names (e.g. -v vs. --verbose). Configuration files use only the long option names to propagate their values to gluetool. If you use a short name (e.g. v = yes), the setting won’t affect gluetool behavior!

These sources are checked for gluetool configuration, later ones overriding earlier ones:

  • /etc/gluetool.d/gluetool
  • ~/.gluetool.d/gluetool
  • ./.gluetool.d/gluetool
  • options specified on a command-line

These sources are checked for module configuration, later ones overriding earlier ones:

  • /etc/gluetool.d/config/<module name>
  • ~/.gluetool.d/config/<module name>
  • ./.gluetool.d/config/<module name>
  • options specified on a command-line

If you’re using a tool derived from gluetool, it may add its own set of directories, e.g. using its own name instead of gluetool, but such a tool should still honor the lists above to stay compatible with the base gluetool.

It is possible to change the list of directories using the --module-config-path option. The default list mentioned above is then replaced by the directories provided by this option.
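The layering described above can be illustrated with a minimal sketch. It is not gluetool's own loader: it only assumes that later sources override earlier ones and that command-line values come last. The function load_layers and the inline INI strings are hypothetical stand-ins for the real files, and the code targets Python 3's standard configparser module.

```python
import configparser


def load_layers(layers, cli_options=None):
    """Merge INI sources in order; later sources win, command-line options last."""
    parser = configparser.ConfigParser()

    for text in layers:
        # Each read_string() call overrides keys set by the previous calls.
        parser.read_string(text)

    store = dict(parser['default']) if parser.has_section('default') else {}

    # Command-line options have the highest priority.
    store.update(cli_options or {})

    return store


# Hypothetical contents of a system-wide and a user-level file.
system_wide = "[default]\nwhom = everyone\ncolor = no\n"
user_level = "[default]\nwhom = happz\n"

print(load_layers([system_wide, user_level]))
print(load_layers([system_wide, user_level], cli_options={'whom': 'you'}))
```

The user-level file replaces only the whom value, while color survives from the system-wide layer. Passing whom on the (simulated) command line overrides both files.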

Todo

  • seealso:
    • option definitions

See also

Module and gluetool configuration
for more information on configuration handling.
Module aliases
for more information on module names and how to rename them

Module aliases

Each module has a name, set by its name class attribute, but sometimes it is useful to run the module under another name. Remember, the module configuration is loaded from files named just like the module. If there’s a way to “rename” a module when it is used in different pipelines, the user can use different configuration files for the same module.

Consider the following example module. It has a single option, whom, whose value is logged in the form of a greeting:

from gluetool import Module

class M(Module):
    name = 'dummy-module'

    options = {
        'whom': {}
    }

    def execute(self):
        self.info('Hi, {}!'.format(self.option('whom')))

With the following configuration, ~/.gluetool.d/config/dummy-module, it will greet your users in a more friendly fashion:

[default]
whom = handsome gluetool user

For some reason, you might wish to use the module in another pipeline, sharing the configuration between both pipelines, but you want to change the greeted entity. One way is to use a command-line option, which overrides configuration files, but that would make one of your pipelines a bit exceptional, carrying extra command-line arguments. Another way is to tell gluetool to use the module but give it a different name. Add an extra configuration file for your “renamed” module, ~/.gluetool.d/config/customized-dummy-module:

[default]
whom = beautiful

A module named customized-dummy-module:dummy-module does not exist, but this form tells gluetool to create an instance of the dummy-module module and name it customized-dummy-module. This is the name used to find and load the module’s configuration.
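To make the naming rule concrete, here is a small sketch of how such a specification could be split. The helper split_module_spec is hypothetical and is not part of gluetool; it only mirrors the alias:module form described above.

```python
def split_module_spec(spec):
    """Split 'alias:module' into (configuration name, module name)."""
    alias, separator, module = spec.partition(':')

    if not separator:
        # No alias given: the module is configured under its own name.
        return spec, spec

    return alias, module


print(split_module_spec('customized-dummy-module:dummy-module'))
print(split_module_spec('dummy-module'))
```

The first element is the name under which configuration files are looked up, the second one is the module that actually gets instantiated.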

You may combine aliases and original modules as much as you wish - gluetool will keep track of names and the actual modules, and it will load the correct configuration:

Todo

  • re-record video because of name => whom

Evaluation context

gluetool and its modules rely heavily on separating code from configuration, offloading things to easily editable files instead of hard-coding them into module sources. Values in configuration files can often be seen as templates, which need a bit of “polishing” to fill in missing bits that depend on the actual state of a pipeline and the resources it operates on. To let modules easily use information encapsulated in other modules in the pipeline, gluetool uses a concept called evaluation context: a module can provide a set of variables it thinks might be interesting to other modules. These variables are collected over all modules in the pipeline and made available as a “context”, a mapping of variable names to their values, which is a form generally understood by pretty much anything that evaluates things, like templating engines.

To provide an evaluation context, a module has to define a property named eval_context. This property should return a mapping of variable names to their values.

For example:

from gluetool import Module
from gluetool.utils import render_template

class M(Module):
    name = 'dummy-module'

    @property
    def eval_context(self):
        return {
            'FOO': id(self)
        }

    def execute(self):
        context = self.shared('eval_context')
        self.info('Known variables: {}'.format(', '.join(context.keys())))

        message = render_template('Absolutely useless ID of this module is {{ FOO }}', **context)

        self.info(message)

The module provides a piece of information, a variable named FOO, to other modules for use in templates and other forms of runtime evaluation. To get access to the global context, collected from all modules, the shared function eval_context is called.

Expected output:

[12:48:41] [+] [dummy-module] Known variables: FOO, ENV
[12:48:41] [+] [dummy-module] Absolutely useless ID of this module is 139695598692432

Note

Modules are asked to provide their context in the same order they are listed in the pipeline, and their contexts are merged, after each query, into a single mapping. It is therefore easy to overwrite variables provided by modules that were queried earlier by simply providing the same variable with a different value.
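The merging rule from the note above can be sketched in a few lines. This is an illustration only, not gluetool's implementation: merge_contexts and the sample contexts are hypothetical, and each dictionary stands for the eval_context of one module in pipeline order.

```python
def merge_contexts(contexts):
    """Merge per-module contexts in pipeline order; later modules win."""
    merged = {}

    for context in contexts:
        # Variables with the same name replace those provided earlier.
        merged.update(context)

    return merged


pipeline_contexts = [
    {'ENV': 'staging', 'USER_ID': 17},  # first module in the pipeline
    {'USER_ID': 42},                    # a later module overrides USER_ID
]

print(merge_contexts(pipeline_contexts))
```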

Note

It is a good practice to prefix names of provided variables, to make them module specific and avoid confusion when it comes to names that might be considered too generic. E.g. variable ID is probably way too universal - is it a user ID, or a task ID? Instead, USER_ID or ARTIFACT_OWNER_ID is much better.

Todo

  • seealso:
    • rendering templates

Long and short option names

When specifying options on the command line, each option can be set using its name: --foo for an option named foo. Historically, it is also common to use “short” variants of option names, consisting of a single character. For example, --help and -h control the same thing. By default, each option defined by a module is a “long” one, suitable for use in the --foo form. A developer who wishes to enable the short form as well can do so by listing both variants when defining the option, grouping them in a tuple.

Consider the following example module. It has a single option, whom, whose value is logged in the form of a greeting. It is possible to use --whom or -w to control the value.

from gluetool import Module

class M(Module):
    name = 'dummy-module'

    options = {
        ('w', 'whom'): {}
    }

    def execute(self):
        self.info('Hi, {}!'.format(self.option('whom')))

Note

Configuration files deal with “long” option names only. I.e. whom = handsome will be correctly propagated into module’s configuration store while w = handsome won’t.
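How a tuple of names could translate into command-line flags is shown in the sketch below. It uses the standard argparse module directly and is only an approximation of what gluetool does internally; build_parser is a hypothetical helper, and the rule "one character means short option" is an assumption made for this example.

```python
import argparse


def build_parser(options):
    """Turn an options mapping into an argparse parser.

    Keys are either a long name or a (short, long) tuple.
    """
    parser = argparse.ArgumentParser(prog='dummy-module')

    for names, params in options.items():
        if isinstance(names, str):
            names = (names,)

        # Single-character names get one dash, longer names get two.
        flags = ['-' + n if len(n) == 1 else '--' + n for n in names]
        parser.add_argument(*flags, **params)

    return parser


parser = build_parser({('w', 'whom'): {'default': 'unknown being'}})

print(parser.parse_args(['-w', 'happz']).whom)
print(parser.parse_args(['--whom', 'happz']).whom)
print(parser.parse_args([]).whom)
```

Both forms store their value under the long name, which is also why only the long name is meaningful in configuration files.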


Todo

Features yet to describe:

  • system-level, user-level and local dir configs
  • configurable list of module paths (with default based on sys.prefix)
  • dry-run support
  • controlled by core
  • module can check what level is set, and take corresponding action. core takes care of logging
  • exception hierarchy
  • hard vs soft errors
  • chaining supported
  • custom sentry fingerprint and tags
  • Failure class to pass by internally
  • processes config file, command line options
  • argparser to configure option
  • option groups
  • required options
  • note to print as a part of help
  • shared functions
  • overloaded shared
  • require_shared
  • module logging helpers
  • sanity => execute => destroy - pipeline flow
  • failure access
  • module discovery mechanism

Help

gluetool tries hard to simplify writing consistent and useful help for modules, their shared functions, options and, of course, the source code. Its markup syntax of choice is reStructuredText (reST), which is used in all docstrings. Sphinx is then used to generate documentation from documents and source code.

Module help

Every module supports a command-line option, -h or --help, that prints information on the module’s usage to the terminal. To provide as much information on the module’s “public API” as possible, several sources are taken into account when generating the overall help for the module. Each of them supports reST syntax, which allows authors to highlight important bits or syntax.

module’s docstring
The developer should describe the module’s purpose, use cases, configuration files and their syntax. Bear in mind that this is the text an end user would read to find out how to use the module, how to configure it and what they should expect from it. Feel free to use reST to include code blocks, emphasize important bits and so on.
module’s options
Every option a module has should have its own help text, set using the help key. These texts are gathered.
module’s shared functions
If the module provides shared functions, their signatures and help texts are gathered.
module’s evaluation context
If the module provides an evaluation context, description for each of its variables is extracted.

All parts are put together, formatted properly, and printed to the terminal in response to the --help option.

Example:

from gluetool import Module

class M(Module):
    """
    This module greets its user.

    See ``--whom`` option.
    """

    name = 'dummy-module'

    options = {
        'whom': {
            'help': 'Greet our caller, whose NAME we are told by this option.',
            'default': 'unknown being',
            'metavar': 'NAME'
        }
    }

    shared_functions = ('hello',)

    def hello(self, name):
        """
        Say "Hi!" to someone.

        :param str name: Name of entity we're supposed to greet.
        """

        self.info('Hi, {}!'.format(name))

    @property
    def eval_context(self):
        __content__ = {
            'NAME': 'Name of entity this module should greet.'
        }

        return {
            'NAME': self.option('whom')
        }

    def execute(self):
        self.hello(self.option('whom'))

Todo

  • run example with a gluetool supporting eval context help
  • seealso:
    • module options
    • shared functions
    • shared functions help
    • eval context
    • help colors

Todo

Features yet to describe:

  • modules, shared functions, etc. help strings generated from their docstrings
  • options help from their definitions in self.options
  • RST formatting supported and evaluated before printing
  • colorized to highlight RST
  • keeps track of terminal width, tries to fit in

Logging

Early debug messages

The default logging level is INFO. While debugging actions that happen early in the pipeline workflow, like module discovery and loading, it may be useful to enable more verbose logging. Unfortunately, verbosity is controlled by a debug option, and this option is taken into account too late to shed light on such problems. For that case, it is possible to tell gluetool to enable debug logging right from the start by setting the environment variable GLUETOOL_DEBUG to any value:

export GLUETOOL_DEBUG=does-not-matter
gluetool -l

As you can see, gluetool dumps much more verbose logging messages - about processing of options, config files and other stuff - on terminal with the variable set.

Note

You can set the variable in any way supported by your shell, session or environment in general. The only important thing is that such variable must exist when gluetool starts.
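The "existence is all that matters" rule can be sketched as follows. This is an assumption about the behavior described above, not a copy of gluetool's code; initial_log_level is a hypothetical helper.

```python
import logging
import os


def initial_log_level(environ=None):
    """Pick the starting log level before any options are parsed."""
    environ = os.environ if environ is None else environ

    # Only the presence of the variable matters, not its value.
    return logging.DEBUG if 'GLUETOOL_DEBUG' in environ else logging.INFO


print(logging.getLevelName(initial_log_level({})))
print(logging.getLevelName(initial_log_level({'GLUETOOL_DEBUG': 'does-not-matter'})))
```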

Logging of structured data

To format structured data, like lists, tuples and dictionaries, for output, use gluetool.log.format_dict().

Example:

import gluetool

print(gluetool.log.format_dict([1, 2, (3, 4)]))

Output:

[
    1,
    2,
    [
        3,
        4
    ]
]

To actually log structured data, the gluetool.log.log_dict() helper is a nice shortcut.

Example:

import gluetool

logger = gluetool.log.Logging.create_logger()

gluetool.log.log_dict(logger.info, 'logging structured data', [1, 2, (3, 4)])

Output:

[14:43:03] [+] logging structured data:
[
    1,
    2,
    [
        3,
        4
    ]
]

The first parameter of log_dict is a callback that receives the formatted data and actually logs it. It is therefore easy to use log_dict at every level of your code, e.g. in methods of your module: just give it a proper callback, like self.info.
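The callback-based design can be sketched with the standard library alone. The helper log_structured below is hypothetical and only imitates the idea; it is not gluetool.log.log_dict(), and its use of json.dumps for formatting is an assumption of this example.

```python
import json


def log_structured(writer, label, data):
    """Format data as indented JSON and hand the result to the writer callback."""
    formatted = json.dumps(data, sort_keys=True, indent=4)

    writer('{}:\n{}'.format(label, formatted))


# Any callable accepting a single string works as the writer.
captured = []
log_structured(captured.append, 'logging structured data', [1, 2, (3, 4)])

print(captured[0])
```

Because the helper never decides where the text goes, the same call works with a list's append method, print, or a logging method such as self.info.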

Todo

  • seealso:
    • logging helpers
    • connecting loggers

See also

Logging of unstructured blobs of text
to find out how to log text blobs.
gluetool.log.format_dict(), gluetool.log.log_dict()
for developer documentation.

Logging of unstructured blobs of text

To format a “blob” of text, without any apparent structure other than new-lines and similar markings, use gluetool.log.format_blob().

It will preserve text formatting over multiple lines, and it will add borders to allow easy separation of the blob from neighbouring text.

To actually log a blob of text, gluetool.log.log_blob() is a shortcut.

The first parameter of log_blob is a callback that receives the formatted data and actually logs it. It is therefore easy to use log_blob at every level of your code, e.g. in methods of your module: just give it a proper callback, like self.info.
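The idea of framing a blob with borders can be sketched like this. The border string and both helpers (frame_blob, log_framed_blob) are hypothetical; the actual markers produced by gluetool.log.format_blob() may look different.

```python
def frame_blob(blob, border='-' * 24):
    """Surround a text blob with border lines, leaving its content untouched."""
    return '{}\n{}\n{}'.format(border, blob, border)


def log_framed_blob(writer, label, blob):
    """Pass the labeled, framed blob to the writer callback."""
    writer('{}:\n{}'.format(label, frame_blob(blob)))


log_framed_blob(print, 'command output', 'first line\n  indented second line')
```

The blob keeps its own line breaks and indentation, while the borders make it clear where the blob starts and ends.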

Todo

  • seealso:
    • logging helpers
    • connecting loggers

See also

Logging of structured data
to find out how to log structured data.
gluetool.log.format_blob(), gluetool.log.log_blob()
for developer documentation.

Logging of XML elements

To format an XML element, use gluetool.log.format_xml().

It will indent nested elements, presenting the tree in a more readable form.

To actually log an XML element, gluetool.log.log_xml() is a shortcut.

The first parameter of log_xml is a callback that receives the formatted data and actually logs it. It is therefore easy to use log_xml at every level of your code, e.g. in methods of your module: just give it a proper callback, like self.info.
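Indenting nested elements can be approximated with the standard library. The helper pretty_xml is hypothetical and relies on xml.dom.minidom, which is an assumption of this sketch and not necessarily what gluetool.log.format_xml() uses. It requires Python 3.

```python
import xml.dom.minidom
import xml.etree.ElementTree as ET


def pretty_xml(element, indent='    '):
    """Return an indented text form of an ElementTree element."""
    raw = ET.tostring(element, encoding='unicode')
    document = xml.dom.minidom.parseString(raw)

    # Render only the root element, without the XML declaration.
    return document.documentElement.toprettyxml(indent=indent)


root = ET.Element('pipeline')
module = ET.SubElement(root, 'module', name='dummy-module')
ET.SubElement(module, 'option', name='whom').text = 'happz'

print(pretty_xml(root))
```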

Todo

  • seealso:
    • logging helpers
    • connecting loggers

See also

Logging of structured data
to find out how to log structured data.
gluetool.log.format_xml(), gluetool.log.log_xml()
for developer documentation.

Object logging helpers

Note

When we talk about a logger, we mean it descriptively: an object that has logging methods we can use. It’s not necessarily an instance of logging.Logger. In fact, given how the logging part of gluetool works, it is most likely an instance of gluetool.log.ContextAdapter. That is not important, though: logging methods like info or error are available on such a “logger” object, no matter what its class is.

Python’s logging system provides a log function for each major log level, usually named by its corresponding level in lowercase, e.g. debug or info. These are reachable as methods of a logger (or logging context adapter) instance. If you have a class which is given a logger, to ease access to these methods, it is possible to “connect” the logger and your class, making logger’s debug & co. direct members of your objects, allowing you to call self.debug, for example.

Example:

from gluetool.log import Logging, ContextAdapter

logger = ContextAdapter(Logging.create_logger())

class Foo(object):
    def __init__(self, logger):
        logger.connect(self)

Foo(logger).info('a message')

Output:

[10:01:15] [+] a message

All standard logging methods (debug, info, warn, error and exception) are made available after connecting a logger.
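What "connecting" means can be sketched with the standard logging module. The function connect below is a hypothetical stand-alone helper, not gluetool's ContextAdapter.connect(); it also uses warning instead of warn, because Logger.warn is deprecated in the standard library.

```python
import logging


def connect(logger, target):
    """Expose the logger's logging methods as attributes of the target object."""
    for name in ('debug', 'info', 'warning', 'error', 'exception'):
        # The bound method keeps a reference to the logger it belongs to.
        setattr(target, name, getattr(logger, name))


class Foo(object):
    def __init__(self, logger):
        connect(logger, self)


logging.basicConfig(level=logging.INFO, format='[%(levelname)s] %(message)s')

foo = Foo(logging.getLogger('sketch'))
foo.info('a message')
```

After the call, foo.info and the logger's own info method are the same bound method, so messages carry the logger's name and configuration.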

Todo

  • seealso:
    • context adapter

See also

logging.Logger.debug()
for logging methods.

Todo

Features yet to describe:

  • clear separation of logging records, making it visible where each of them starts and what is a log message and what a logged blob of command output
  • default log level controlled by env var
  • warn(sentry=True)
  • verbose, readable, formatted traceback logging
  • using context adapters to add “structure” to logged messages
  • colorized messages based on their level
  • optional “log everything” dump in a file
  • correct and readable logging of exception chains

Colorized output

gluetool uses the colorama library to enhance many of its outputs with colors. This is done transparently: the developer does not need to think about it, and the user can control the feature with a single option.

Control

Color support is disabled by default, and can be turned on using the --color option.

If the colorama package is not installed, color support cannot be turned on. If the user tries to do that, gluetool will emit a warning.

Note

As of now, colorama is gluetool’s hard requirement, therefore it should not be possible, at least out of the box, to run gluetool without having colorama installed. However, this may change in the future, leaving this support up to the user’s decision.

To control this feature programmatically, see gluetool.color.switch().

Todo

  • seealso:
    • how to specify options

Colorized logs

Messages, logged on the terminal, are colorized based on their level:

DEBUG log level inherits default text color of your terminal, while, for example, ERROR is highlighted by being red, and INFO level is printed with nice, comforting green.

Todo

  • seealso:
    • logging

Colorized help

gluetool uses reStructuredText (reST) to document modules, shared functions, options and other things. To make the help texts even more readable, the formatting provided by reST is enhanced with colors, helping users orient themselves and focus on important information.

Todo

  • seealso:
    • generic help
    • module help
    • option help

Colors in templates

Color support is available for templates as well, via the style filter.

Example:

import gluetool

gluetool.log.Logging.create_logger()
gluetool.color.switch(True)

print(gluetool.utils.render_template('{{ "foo" | style(fg="red", bg="green") }}'))

See also

Rendering templates
for more information about rendering templates with gluetool.

Sentry integration

gluetool integrates easily with the Sentry platform, simplifying the collection of issues, code crashes, warnings and other important events your deployed code produces. This integration is optional (it must be explicitly enabled) and transparent (common events, like exceptions, do not need to be reported explicitly).

When enabled, every unhandled exception is automatically reported to Sentry. Helpers for explicit reporting of handled exceptions and warnings are available, as well as the bare method for reporting arbitrary events.

Control

Sentry integration is controlled by environment variables. It must be possible to configure it even before gluetool has a chance to process the given options. To enable Sentry integration, set at least the SENTRY_DSN variable:

export SENTRY_DSN="https://<key>:<secret>@sentry.io/<project>"

This variable tells Sentry-related code where it should report the events. Without this variable set, Sentry integration is disabled. All relevant functions can still be called but do not report any events to Sentry, since they don’t know where to send their reports.

See also

About the DSN
for detailed information on Sentry DSNs and their use.
gluetool.sentry module
for developer documentation.

Sentry tags & environment variables

Sentry allows attaching “tags” to reported events. To use environment variables as such tags, set the SENTRY_TAG_MAP variable. It holds comma-separated pairs of the form tag=variable, naming each tag and the environment variable it takes its value from.

export SENTRY_TAG_MAP="username=USER,hostname=HOSTNAME"

Should there be an event to report, the integration code will attach two tags to it, username and hostname, using the environment variables USER and HOSTNAME respectively as the source of their values.
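The mapping format can be illustrated with a short parser. The function tags_from_environment is hypothetical, and the decision to skip variables that are not set is an assumption of this sketch, not documented gluetool behavior.

```python
import os


def tags_from_environment(tag_map, environ=None):
    """Resolve 'tag=VARIABLE,...' pairs into a mapping of tags to values."""
    environ = os.environ if environ is None else environ
    tags = {}

    for pair in tag_map.split(','):
        tag, _, variable = pair.strip().partition('=')

        # Assumption: pairs pointing to unset variables are silently skipped.
        if tag and variable in environ:
            tags[tag] = environ[variable]

    return tags


fake_environ = {'USER': 'happz', 'HOSTNAME': 'build-01'}

print(tags_from_environment('username=USER,hostname=HOSTNAME', fake_environ))
```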

See also

Tagging Events
for detailed information on event tags.

Logging of submitted events

The integration code is able to log every reported event. To enable this feature, set the SENTRY_BASE_URL environment variable to the URL of the project gluetool is reporting events to. While SENTRY_DSN controls the whole integration and has its meaning within the Sentry server your gluetool runs report to, SENTRY_BASE_URL is used only in a cosmetic way: gluetool appends the ID of the reported event to it. The resulting URL, if followed, should lead to your project and the relevant event.

As you can see, the exception, raised by gluetool when there were no command-line options telling it what to do, has been submitted to Sentry, and immediately logged, with ERROR loglevel.

See also

Sentry - Control
for more information about Sentry integration.

Warnings

By default, only unhandled exceptions are submitted to Sentry. It is, however, also possible to submit warnings, e.g. when a warning is worth capturing yet it is not necessary to raise an exception and kill the whole gluetool pipeline. For that case, the warn logging method accepts the sentry keyword parameter, which, when set to True, uses Sentry-related code to submit the given message to the configured Sentry instance. The message is also always logged like any other warning.

Example:

from gluetool.log import Logging

logger = Logging.create_logger()

logger.warn('foo', sentry=True)

Output:

[17:16:50] [W] foo

Todo

  • video

See also

Object logging helpers
for more information on logging methods.
gluetool.log.warn_sentry()
for developer documentation.

Todo

Features yet to describe:

  • all env variables are attached to events (breadcrumbs)
  • logging records are attached to events (breadcrumbs)
  • URL of every reported event available for examination by code
  • soft-error tag for failure.soft errors
  • raised exceptions can provide custom fingerprints and tags
  • submit_exception and submit_warning for explicit submissions
  • logger.warn(sentry=True)

Utils

Rendering templates

gluetool and its modules make heavy use of Jinja2 templates. To help with their processing, gluetool provides the gluetool.utils.render_template() helper, which accepts both raw string templates and instances of jinja2.Template and renders them with the given context variables. Its added value is uniform logging of the template and the variables used.

Example:

import gluetool

gluetool.log.Logging.create_logger()

print(gluetool.utils.render_template('Say hi to {{ user }}', user='happz'))

Output:

Say hi to happz

See also

Jinja2 templates
for information about this fast & modern templating engine.
gluetool.utils.render_template()
for developer documentation.
Colors in templates
for using colors in templates

Normalize URL

URLs coming from different systems, or created by joining their parts, might contain redundant bits, duplicates, multiple .. entries, a mix of uppercase and lowercase characters and similar clutter. Such URLs are not very pretty. To “prettify” your URLs, use gluetool.utils.treat_url().

For example:

from gluetool.log import Logging
from gluetool.utils import treat_url

print(treat_url('HTTP://FoO.bAr.coM././foo/././../foo/index.html'))

Output:

http://foo.bar.com/foo/index.html
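For readers curious about what such a clean-up involves, here is a rough standard-library approximation. It is not the implementation behind treat_url(): normalize_url is a hypothetical helper, it targets Python 3, and it uses a slightly simpler input than the example above because it does not handle a trailing dot after the host name.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url):
    """Lowercase the scheme and host, and collapse '.' and '..' path segments."""
    parts = urlsplit(url)  # urlsplit() already lowercases the scheme

    # normpath() resolves '.' and '..' but also drops a trailing slash.
    path = posixpath.normpath(parts.path) if parts.path else ''

    return urlunsplit((parts.scheme, parts.netloc.lower(), path, parts.query, parts.fragment))


print(normalize_url('HTTP://FoO.bAr.coM/./foo/././../foo/index.html'))
```

Lowercasing the whole network location would also affect credentials embedded in the URL, so a production helper needs more care than this sketch takes.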

Todo

Features yet to describe:

  • dict_update
  • converting various command-line options to unified output
  • boolean switches via normalize_bool_option
  • multiple string values (--foo A,B --foo C => [A,B,C])
  • path - expanduser & abspath applied
  • multiple paths - like multiple string values, but normalized like above
  • “worker thread” class - give it a callable, it will return its return value, taking care of catching exceptions
  • running external apps via run_command
  • Bunch object for grouping arbitrary data into a single object, or wrapping a dictionary as an object (d[key] => d.key)
  • cached_property decorator
  • formatted logging of arbitrary command-line - if you have a command-line to format, we have a function for that
  • fetch data from a given URL
  • load data from YAML or JSON file or string
  • write data structures as YAML or JSON
  • pattern maps
  • waiting for things to finish
  • creating XML elements
  • checking whether external apps are available and runnable

Tool

Todo

Features yet to describe:

  • reusable heart of gluetool
  • config file on system level, user level or in a local dir

Todo

Features yet to describe:

  • custom pylint checkers
  • option names
  • shared function definitions