gluetool
features¶
A comprehensive list of gluetool features, available helpers, tricks and tips: all the ways gluetool helps module developers.
Core¶
Module and gluetool
configuration¶
Configuration of gluetool and every module is gathered from several sources with different priorities, and merged into a single store, accessible through the option() method. Values from later, higher-priority sources replace values set by earlier, lower-priority ones. That way it is possible to combine multiple configuration files for a module, e.g. a generic site-wide configuration with a user-specific configuration overriding the global settings. Options specified on the command-line have the highest priority, overriding all configuration files.
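The priority rule can be pictured as a chain of dictionary updates. The following is only a minimal sketch of the idea, not gluetool's actual implementation; `merge_sources` and the sample dictionaries are hypothetical names used for illustration.

```python
def merge_sources(*sources):
    """Merge option mappings; later sources override earlier ones."""
    store = {}
    for source in sources:
        store.update(source)
    return store


# Sources ordered from the lowest to the highest priority (illustrative values).
defaults = {'whom': 'unknown being'}
site_config = {'whom': 'site user'}
user_config = {'whom': 'happz'}
command_line = {}  # nothing was given on the command-line in this run

store = merge_sources(defaults, site_config, user_config, command_line)
print(store['whom'])
```

Because the command-line mapping is applied last, any value it contains would win over every configuration file.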
Consider the following example module. It has just a single option, whom, whose value is logged in the form of a greeting. The option has a default value, unknown being:
from gluetool import Module

class M(Module):
    name = 'dummy-module'

    options = {
        'whom': {
            'default': 'unknown being'
        }
    }

    def execute(self):
        self.info('Hi, {}!'.format(self.option('whom')))
With a configuration file, ~/.gluetool.d/config/dummy-module
, you can change the value of whom
:
[default]
whom = happz
With this file in place, the configuration file for dummy-module is loaded and the option() method returns the configured value, happz.
Options specified on a command-line are merged into the store transparently, without any additional action necessary:
Todo
- re-record video because of name => whom
- seealso: options definitions
See also
- Configuration files - to see what configuration files are examined.
Configuration files¶
For every module, including gluetool itself, gluetool checks several possible sources of configuration, merging all information found into a single configuration store, which can be queried at runtime using the option() method.
Configuration files follow a simple INI format, with a single section called [default], containing all options:
[default]
option-foo = value bar
Warning
Options can have short and long names (e.g. -v vs. --verbose). Configuration files use only the long option names to propagate their values to gluetool. If you use a short name (e.g. v = yes), such a setting won’t affect gluetool’s behavior!
These sources are checked for gluetool configuration:
- /etc/gluetool.d/gluetool
- ~/.gluetool.d/gluetool
- ./.gluetool.d/gluetool
- options specified on a command-line
These sources are checked for module configuration:
- /etc/gluetool.d/config/<module name>
- ~/.gluetool.d/config/<module name>
- ./.gluetool.d/config/<module name>
- options specified on a command-line
If you’re using a tool derived from gluetool, it may add its own set of directories, e.g. using its name instead of gluetool, but such a tool should still honor the lists mentioned above, to stay compatible with the base gluetool.
It is possible to change the list of directories using the --module-config-path option. The default list mentioned above is then replaced by the directories provided by this option.
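To illustrate the lookup, here is a rough sketch that reads the [default] section from a list of candidate files, letting later files override earlier ones. It uses the standard `configparser` module of Python 3 and is an assumption about how such a lookup could be written, not gluetool's own code; `load_module_config` and `DEFAULT_DIRS` are hypothetical names.

```python
import configparser
import os

# Directories from the list above, in the order they are listed.
DEFAULT_DIRS = ('/etc/gluetool.d', '~/.gluetool.d', './.gluetool.d')


def load_module_config(module_name, dirs=DEFAULT_DIRS):
    """Return options of a module, merged from all existing config files."""
    options = {}
    for directory in dirs:
        path = os.path.join(os.path.expanduser(directory), 'config', module_name)
        parser = configparser.ConfigParser(interpolation=None)
        # read() silently skips missing files and returns the list of files it parsed.
        if parser.read(path) and parser.has_section('default'):
            # Files read later override values found in earlier ones.
            options.update(parser['default'])
    return options


print(load_module_config('dummy-module'))
```

Passing a different `dirs` value mimics what replacing the default list of directories would do.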
Todo
- seealso: option definitions
See also
- Module and gluetool configuration - for more information on configuration handling.
- Module aliases - for more information on module names and how to rename them.
Module aliases¶
Each module has a name, as set by its name class attribute, but sometimes it might be useful to use the module under another name. Remember, the module configuration is loaded from files named just like the module. If a module can be “renamed” when it is used in different pipelines, the user can use different configuration files for the same module.
Consider the following example module. It has just a single option, whom, whose value is logged in the form of a greeting:
from gluetool import Module

class M(Module):
    name = 'dummy-module'

    options = {
        'whom': {}
    }

    def execute(self):
        self.info('Hi, {}!'.format(self.option('whom')))
With the following configuration, ~/.gluetool.d/config/dummy-module
, it will greet your users in a more friendly fashion:
[default]
whom = handsome gluetool user
For some reason, you might wish to use the module in another pipeline, sharing the configuration between both pipelines, but with a different greeted entity. One way is to use a command-line option, which overrides configuration files, but that would make one of your pipelines a bit exceptional, carrying some extra command-line stuff. The other way is to tell gluetool to use the module but give it a different name. Add an extra configuration file for your “renamed” module, ~/.gluetool.d/config/customized-dummy-module:
[default]
whom = beautiful
A module named customized-dummy-module:dummy-module does not exist, but this form tells gluetool it should create an instance of the dummy-module module and name it customized-dummy-module. This is the name used to find and load the module’s configuration.
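The alias form can be thought of as a simple split on the colon. The sketch below is only an illustration of that reading; `parse_module_spec` and `config_file_for` are hypothetical helpers, not part of gluetool, and the path follows the user-level directory mentioned earlier.

```python
def parse_module_spec(spec):
    """Split ``alias:module`` into (name to use, module to instantiate)."""
    alias, separator, module = spec.partition(':')
    if not separator:
        # No alias given: the module is used under its own name.
        return spec, spec
    return alias, module


def config_file_for(spec):
    """Return the user-level configuration file for a module spec."""
    name, _ = parse_module_spec(spec)
    # The configuration is looked up by the alias, not by the original module name.
    return '~/.gluetool.d/config/{}'.format(name)


print(parse_module_spec('customized-dummy-module:dummy-module'))
print(config_file_for('customized-dummy-module:dummy-module'))
```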
You may combine aliases and original modules as much as you wish - gluetool
will keep track of names and the actual modules, and it will load the correct configuration:
Todo
- re-record video because of name => whom
Evaluation context¶
gluetool and its modules rely heavily on separating code from configuration, offloading things to easily editable files instead of hard-coding them into module sources. Values in configuration files can often be seen as templates, which need a bit of “polishing” to fill in missing bits that depend on the actual state of a pipeline and the resources it operates on. To let modules easily share and use information encapsulated in other modules in the pipeline, gluetool uses a concept called evaluation context: a module can provide a set of variables it thinks might be interesting to other modules. These variables are collected over all modules in the pipeline, and made available as a “context”, a mapping of variable names to their values, which is a form generally understood by pretty much anything that evaluates things, like templating engines.
To provide an evaluation context, a module has to define a property named eval_context. This property should return a mapping of variable names to their values.
For example:
from gluetool import Module
from gluetool.utils import render_template

class M(Module):
    name = 'dummy-module'

    @property
    def eval_context(self):
        return {
            'FOO': id(self)
        }

    def execute(self):
        context = self.shared('eval_context')

        self.info('Known variables: {}'.format(', '.join(context.keys())))

        message = render_template('Absolutely useless ID of this module is {{ FOO }}', **context)
        self.info(message)
The module provides a piece of information to other modules, a variable named FOO, for use in templates and other forms of runtime evaluation. To get access to the global context, collected from all modules, it calls the shared function eval_context.
Expected output:
[12:48:41] [+] [dummy-module] Known variables: FOO, ENV
[12:48:41] [+] [dummy-module] Absolutely useless ID of this module is 139695598692432
Note
Modules are asked to provide their context in the same order they are listed in the pipeline, and their contexts are merged, after each query, into a single mapping. It is therefore easy to overwrite variables provided by modules that were queried earlier by simply providing the same variable with a different value.
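The merging order described in this note can be sketched with plain dictionaries. This is a simplified illustration, not gluetool's code; the two module classes and `collect_eval_context` are hypothetical.

```python
class FirstModule(object):
    # Stand-in for a module listed first in the pipeline.
    eval_context = {'FOO': 'from first', 'BAR': 'only in first'}


class SecondModule(object):
    # Stand-in for a module listed later; it provides FOO as well.
    eval_context = {'FOO': 'from second'}


def collect_eval_context(pipeline):
    """Query modules in pipeline order and merge their contexts."""
    context = {}
    for module in pipeline:
        # Variables of later modules overwrite those of earlier ones.
        context.update(module.eval_context)
    return context


print(collect_eval_context([FirstModule(), SecondModule()]))
```

Swapping the order of the two modules would make the value from `FirstModule` win instead.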
Note
It is a good practice to prefix names of provided variables, to make them module specific and avoid confusion when it comes to names that might be considered too generic. E.g. variable ID
is probably way too universal - is it a user ID, or a task ID? Instead, USER_ID
or ARTIFACT_OWNER_ID
is much better.
Todo
- seealso: rendering templates
Long and short option names¶
When specifying options on a command-line, each option can be set using its name: --foo for an option named foo. Historically, it is also common to use “short” variants of option names, consisting of just a single character. For example, --help and -h control the same thing. By default, each option defined by a module is a “long” one, suitable for use in the --foo form. A developer who wishes to enable the short form as well can do so by using both variants when defining the option, grouping them in a tuple.
Consider the following example module. It has just a single option, whom, whose value is logged in the form of a greeting. It is possible to use --whom or -w to control the value.
from gluetool import Module

class M(Module):
    name = 'dummy-module'

    options = {
        ('w', 'whom'): {}
    }

    def execute(self):
        self.info('Hi, {}!'.format(self.option('whom')))
Note
Configuration files deal with “long” option names only. I.e. whom = handsome
will be correctly propagated into module’s configuration store while w = handsome
won’t.
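One way to picture how a tuple of names could be turned into command-line flags is the following sketch built on the standard `argparse` module. It is an assumption made for illustration, not a description of gluetool's internals; `add_options` is a hypothetical helper.

```python
import argparse


def add_options(parser, options):
    """Register options; keys are either a long name or a tuple of names."""
    for names, params in options.items():
        if isinstance(names, str):
            names = (names,)
        # Single-character names become short flags, the rest long flags.
        flags = ['-' + name if len(name) == 1 else '--' + name for name in names]
        parser.add_argument(*flags, **params)


parser = argparse.ArgumentParser(prog='dummy-module')
add_options(parser, {('w', 'whom'): {}})

print(parser.parse_args(['-w', 'happz']).whom)
print(parser.parse_args(['--whom', 'happz']).whom)
```

Both invocations store the value under the long name, which matches the way the value is read with `option('whom')` in the module above.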
Todo
Features yet to describe:
- system-level, user-level and local dir configs
- configurable list of module paths (with default based on sys.prefix)
- dry-run support
- controlled by core
- module can check what level is set, and take corresponding action. core takes care of logging
- exception hierarchy
- hard vs soft errors
- chaining supported
- custom sentry fingerprint and tags
- Failure class to pass by internally
- processes config file, command line options
- argparser to configure option
- option groups
- required options
- note to print as a part of help
- shared functions
- overloaded shared
- require_shared
- module logging helpers
- sanity => execute => destroy - pipeline flow
- failure access
- module discovery mechanism
Help¶
gluetool tries hard to simplify writing of consistent and useful help for modules, their shared functions, options and, of course, the source code. Its markup syntax of choice is reStructuredText (reST), which is used in all docstrings. Sphinx is then used to generate documentation from documents and source code.
Module help¶
Every module supports a command-line option -h or --help that prints information on the module’s usage on the terminal. To provide as much information on the module’s “public API” as possible, several sources are taken into account when generating the overall help for the module. Each of them supports reST syntax, which allows authors to highlight important bits or syntax.
- module’s docstring - The developer should describe the module’s purpose, use cases, configuration files and their syntax. Bear in mind that this is the text an end user would read to find out how to use the module, how to configure it and what they should expect from it. Feel free to use reST to include code blocks, emphasize important bits and so on.
- module’s options - Every option the module has should have its own help text, set using the help key. These texts are gathered.
- module’s shared functions - If the module provides shared functions, their signatures and help texts are gathered.
- module’s evaluation context - If the module provides an evaluation context, a description of each of its variables is extracted.
All parts are put together, formatted properly, and printed out to terminal in response to --help
option.
Example:
from gluetool import Module

class M(Module):
    """
    This module greets its user.

    See ``--whom`` option.
    """

    name = 'dummy-module'

    options = {
        'whom': {
            'help': 'Greet our caller, whose NAME we are told by this option.',
            'default': 'unknown being',
            'metavar': 'NAME'
        }
    }

    shared_functions = ('hello',)

    def hello(self, name):
        """
        Say "Hi!" to someone.

        :param str name: Name of entity we're supposed to greet.
        """

        self.info('Hi, {}!'.format(name))

    @property
    def eval_context(self):
        __content__ = {
            'NAME': 'Name of entity this module should greet.'
        }

        return {
            'NAME': self.option('whom')
        }

    def execute(self):
        self.hello(self.option('whom'))
Todo
- run example with a gluetool supporting eval context help
- seealso: module options, shared functions, shared functions help, eval context, help colors
Todo
Features yet to describe:
- modules, shared functions, etc. help strings generated from their docstrings
- options help from their definitions in self.options
- RST formatting supported and evaluated before printing
- colorized to highlight RST
- keeps track of terminal width, tries to fit in
Logging¶
Early debug messages¶
The default logging level is set to INFO. While debugging actions happening early in the pipeline workflow, like module discovery and loading, it may be useful to enable more verbose logging. Unfortunately, this feature is controlled by the debug option, and this option is taken into account too late to shed light on your problem. For that case, it is possible to tell gluetool to enable debug logging right from its beginning, by setting the environment variable GLUETOOL_DEBUG to any value:
export GLUETOOL_DEBUG=does-not-matter
gluetool -l
With the variable set, gluetool dumps much more verbose logging messages on the terminal, about processing of options, config files and other stuff.
Note
You can set the variable in any way supported by your shell, session or environment in general. The only important thing is that such variable must exist when gluetool
starts.
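The idea of deciding on the log level before any option is parsed can be sketched as below. This is a hypothetical illustration, not gluetool's implementation; `initial_log_level` is an invented name, and treating the mere presence of the variable as the trigger is an assumption based on the "any value" wording above.

```python
import logging
import os


def initial_log_level(environ=None):
    """Pick the log level to use before options are processed."""
    environ = os.environ if environ is None else environ
    # Only the existence of the variable matters, its value is ignored.
    if 'GLUETOOL_DEBUG' in environ:
        return logging.DEBUG
    return logging.INFO


print(logging.getLevelName(initial_log_level({'GLUETOOL_DEBUG': 'does-not-matter'})))
print(logging.getLevelName(initial_log_level({})))
```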
Logging of structured data¶
To format structured data, like lists, tuples and dictionaries, for output, use gluetool.log.format_dict().
Example:
import gluetool

print(gluetool.log.format_dict([1, 2, (3, 4)]))
Output:
[
    1,
    2,
    [
        3,
        4
    ]
]
To actually log structured data, the gluetool.log.log_dict()
helper is a nice shortcut.
Example:
import gluetool
logger = gluetool.log.Logging.create_logger()
gluetool.log.log_dict(logger.info, 'logging structured data', [1, 2, (3, 4)])
Output:
[14:43:03] [+] logging structured data:
[
    1,
    2,
    [
        3,
        4
    ]
]
The first parameter of log_dict is a callback that receives the formatted data and actually logs it. It is therefore easy to use log_dict at every level of your code, e.g. in methods of your module: just give it a proper callback, like self.info.
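The callback-based design can be illustrated without gluetool at all. The following sketch mimics the pattern with the standard `json` module; `format_structured` and `log_structured` are hypothetical stand-ins and do not claim to reproduce the real helpers.

```python
import json


def format_structured(data):
    """Render lists, tuples and dictionaries as indented text."""
    return json.dumps(data, sort_keys=True, indent=4)


def log_structured(writer, intro, data):
    """Format the data and hand the result to a logging callback."""
    writer('{}:\n{}'.format(intro, format_structured(data)))


# Any callable accepting a string can serve as the callback,
# e.g. print, a list's append method or a logger's info method.
messages = []
log_structured(messages.append, 'logging structured data', [1, 2, (3, 4)])
print(messages[0])
```

Since the helper only needs a callable, the same function works with a module's logging methods or with a plain logger.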
Todo
- seealso: logging helpers, connecting loggers
See also
- Logging of unstructured blobs of text - to find out how to log text blobs.
- gluetool.log.format_dict(), gluetool.log.log_dict() - for developer documentation.
Logging of unstructured blobs of text¶
To format a “blob” of text, without any apparent structure other than new-lines and similar markings, use gluetool.log.format_blob()
:
It will preserve text formatting over multiple lines, and it will add borders to allow easy separation of the blob from neighbouring text.
To actually log a blob of text, gluetool.log.log_blob()
is a shortcut:
The first parameter of log_blob is a callback that receives the formatted data and actually logs it. It is therefore easy to use log_blob at every level of your code, e.g. in methods of your module: just give it a proper callback, like self.info.
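As a rough picture of what "adding borders" to a blob means, consider this sketch. The border characters and the function names `format_text_blob` and `log_text_blob` are assumptions made for the example; the borders actually produced by gluetool may look different.

```python
def format_text_blob(text, border='-' * 20):
    """Wrap multi-line text in borders, keeping its line breaks intact."""
    return '{}\n{}\n{}'.format(border, text, border)


def log_text_blob(writer, intro, text):
    """Format the blob and pass it to a logging callback."""
    writer('{}:\n{}'.format(intro, format_text_blob(text)))


log_text_blob(print, 'command output', 'first line\n  second line, indented')
```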
Todo
- seealso: logging helpers, connecting loggers
See also
- Logging of structured data - to find out how to log structured data.
- gluetool.log.format_blob(), gluetool.log.log_blob() - for developer documentation.
Logging of XML elements¶
To format an XML element, use gluetool.log.format_xml()
:
It will indent nested elements, presenting the tree in a more readable form.
To actually log an XML element, gluetool.log.log_xml()
is a shortcut:
The first parameter of log_xml is a callback that receives the formatted data and actually logs it. It is therefore easy to use log_xml at every level of your code, e.g. in methods of your module: just give it a proper callback, like self.info.
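Indenting nested elements can be demonstrated with the standard library alone. This is a sketch of the general technique, assuming elements built with `xml.etree.ElementTree`; `format_element` is a hypothetical name and the output style is not necessarily the one gluetool produces.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom


def format_element(element, indent='  '):
    """Return an XML element as text with nested elements indented."""
    raw = ET.tostring(element, encoding='unicode')
    # Re-parse with minidom, which knows how to pretty-print a tree.
    pretty = minidom.parseString(raw).documentElement.toprettyxml(indent=indent)
    return pretty.strip()


root = ET.Element('root')
ET.SubElement(root, 'child').text = 'text'

print(format_element(root))
```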
Todo
- seealso: logging helpers, connecting loggers
See also
- Logging of structured data - to find out how to log structured data.
- gluetool.log.format_xml(), gluetool.log.log_xml() - for developer documentation.
Object logging helpers¶
Note
When we talk about a logger, we mean it as a description: an object that has logging methods we can use. It’s not necessarily an instance of logging.Logger - in fact, given how the logging part of gluetool works, it is most likely an instance of gluetool.log.ContextAdapter. But that is not important: the API, i.e. logging methods like info or error, is available in such a “logger” object, no matter what its class is.
Python’s logging system provides a log function for each major log level, usually named by its corresponding level in lowercase, e.g. debug
or info
. These are reachable as methods of a logger (or logging context adapter) instance. If you have a class which is given a logger, to ease access to these methods, it is possible to “connect” the logger and your class, making logger’s debug
& co. direct members of your objects, allowing you to call self.debug
, for example.
Example:
from gluetool.log import Logging, ContextAdapter

logger = ContextAdapter(Logging.create_logger())

class Foo(object):
    def __init__(self, logger):
        logger.connect(self)

Foo(logger).info('a message')
Output:
[10:01:15] [+] a message
All standard logging methods, debug, info, warn, error and exception, are made available after connecting a logger.
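Conceptually, "connecting" a logger amounts to copying its bound logging methods onto another object. The sketch below shows that idea with the standard `logging` module only; the `connect` function is hypothetical and is not gluetool's implementation. It uses `warning` because the standard library's `warn` alias is deprecated.

```python
import logging

LOGGING_METHODS = ('debug', 'info', 'warning', 'error', 'exception')


def connect(logger, obj):
    """Expose the logger's logging methods as attributes of obj."""
    for name in LOGGING_METHODS:
        setattr(obj, name, getattr(logger, name))


class Foo(object):
    def __init__(self, logger):
        connect(logger, self)


logging.basicConfig(level=logging.INFO, format='%(message)s')
Foo(logging.getLogger('sketch')).info('a message')
```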
Todo
- seealso: context adapter
See also
- logging.Logger.debug() - for logging methods.
Todo
Features yet to describe:
- clear separation of logging records, making it visible where each of them starts and what is a log message and what a logged blob of command output
- default log level controlled by env var
- warn(sentry=True)
- verbose, readable, formatted traceback logging
- using context adapters to add “structure” to logged messages
- colorized messages based on their level
- optional “log everything” dump in a file
- correct and readable logging of exception chains
Colorized output¶
gluetool uses the colorama library to enhance many of its outputs with colors. This is done in a transparent way: the developer does not need to think about it, and the user can control this feature with a single option.
Control¶
Color support is disabled by default, and can be turned on using --color
option:
If colorama
package is not installed, color support cannot be turned on. If user tries to do that, gluetool
will emit a warning:
Note
As of now, colorama is gluetool’s hard requirement, therefore it should not be possible, at least out of the box, to run gluetool without having colorama installed. However, this may change in the future, leaving this support up to the user’s decision.
To control this feature programmatically, see gluetool.color.switch().
Todo
- seealso: how to specify options
Colorized logs¶
Messages, logged on the terminal, are colorized based on their level:
DEBUG
log level inherits default text color of your terminal, while, for example, ERROR
is highlighted by being red, and INFO
level is printed with nice, comforting green.
Todo
- seealso: logging
Colorized help¶
gluetool uses reStructuredText (reST) to document modules, shared functions, options and other things. To make the help texts even more readable, the formatting provided by reST is enhanced with colors, to help users orient themselves and focus on important information.
Todo
- seealso: generic help, module help, option help
Colors in templates¶
Color support is available for templates as well, via style
filter.
Example:
import gluetool

gluetool.log.Logging.create_logger()
gluetool.color.switch(True)

print(gluetool.utils.render_template('{{ "foo" | style(fg="red", bg="green") }}'))
See also
- Rendering templates
- for more information about rendering templates with
gluetool
.
Sentry integration¶
gluetool integrates easily with the Sentry platform, simplifying the collection of issues, code crashes, warnings and other important events your deployed code produces. This integration is optional, i.e. it must be explicitly enabled, and transparent, i.e. it is not necessary to explicitly report common events, like exceptions.
When enabled, every unhandled exception is automatically reported to Sentry. Helpers for explicit reporting of handled exceptions and warnings are available, as well as the bare method for reporting arbitrary events.
Control¶
Sentry integration is controlled by environment variables. It must be possible to configure it even before gluetool has a chance to process given options. To enable Sentry integration, one has to set at least the SENTRY_DSN variable:
export SENTRY_DSN="https://<key>:<secret>@sentry.io/<project>"
This variable tells Sentry-related code where it should report the events. Without this variable set, Sentry integration is disabled. All relevant functions still can be called but do not report any events to Sentry, since they don’t know where to send their reports.
See also
- About the DSN - for detailed information on Sentry DSNs and their use.
- gluetool.sentry module - for developer documentation.
Sentry tags & environment variables¶
Sentry allows attaching “tags” to reported events. To use environment variables as such tags, set the SENTRY_TAG_MAP variable. It contains comma-separated pairs, each naming a tag and the environment variable that provides its value.
export SENTRY_TAG_MAP="username=USER,hostname=HOSTNAME"
Should there be an event to report, the integration code will attach two tags to it, username and hostname, using the environment variables USER and HOSTNAME respectively as the source of their values.
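The mapping format can be parsed with a few lines of Python. The sketch below is only an illustration of the `tag=VARIABLE` format shown above; `parse_tag_map` is a hypothetical helper, and silently skipping variables that are not set is an assumption, not documented behavior.

```python
import os


def parse_tag_map(tag_map, environ=None):
    """Turn 'tag=VARIABLE,...' into a dictionary of tags and their values."""
    environ = os.environ if environ is None else environ
    tags = {}
    for pair in tag_map.split(','):
        tag, _, variable = pair.strip().partition('=')
        # Assumption: pairs whose source variable is not set are ignored.
        if tag and variable in environ:
            tags[tag] = environ[variable]
    return tags


sample_environ = {'USER': 'happz', 'HOSTNAME': 'workstation'}
print(parse_tag_map('username=USER,hostname=HOSTNAME', sample_environ))
```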
See also
- Tagging Events
- for detailed information on event tags.
Logging of submitted events¶
The integration code is able to log every reported event. To enable this feature, set the SENTRY_BASE_URL environment variable to the URL of the project gluetool is reporting events to. While SENTRY_DSN controls the whole integration and has its meaning within the Sentry server your gluetool runs report to, SENTRY_BASE_URL is used only in a cosmetic way: gluetool code adds the ID of the reported event to it. The resulting URL, if followed, should lead to your project and the relevant event.
For example, an exception raised by gluetool when there are no command-line options telling it what to do is submitted to Sentry, and immediately logged with the ERROR log level.
See also
- Sentry - Control
- for more information about Sentry integration.
Warnings¶
By default, only unhandled exceptions are submitted to Sentry. It is however possible, among others, to submit warnings, e.g. when a warning is worth capturing yet it is not necessary to raise an exception and kill the whole gluetool pipeline. For that case, the warn logging method accepts a sentry keyword parameter, which, when set to True, uses Sentry-related code to submit the given message to the configured Sentry instance. The message is also always logged like any other warning.
Example:
from gluetool.log import Logging
logger = Logging.create_logger()
logger.warn('foo', sentry=True)
Output:
[17:16:50] [W] foo
Todo
- video
See also
- Object logging helpers
- for more information on logging methods.
gluetool.log.warn_sentry()
- for developer documentation.
Todo
Features yet to describe:
- all env variables are attached to events (breadcrumbs)
- logging records are attached to events (breadcrumbs)
- URL of every reported event available for examination by code
- soft-error tag for failure.soft errors
- raised exceptions can provide custom fingerprints and tags
- submit_exception and submit_warning for explicit submissions
- logger.warn(sentry=True)
Utils¶
Rendering templates¶
gluetool and its modules make heavy use of Jinja2 templates. To help with their processing, it provides the gluetool.utils.render_template() helper, which accepts both raw string templates and instances of jinja2.Template, and renders them with the given context variables. Its added value is uniform logging of the template and the variables used.
Example:
import gluetool

gluetool.log.Logging.create_logger()

print(gluetool.utils.render_template('Say hi to {{ user }}', user='happz'))
Output:
Say hi to happz
See also
- Jinja2 templates
- for information about this fast & modern templating engine.
gluetool.utils.render_template()
- for developer documentation.
- Colors in templates
- for using colors in templates
Normalize URL¶
URLs coming from different systems, or created by joining their parts, might contain redundant bits, duplicate separators, multiple .. entries, a mix of uppercase and lowercase characters and similar stuff. Such URLs are not very pretty. To “prettify” your URLs, use gluetool.utils.treat_url().
For example:
from gluetool.log import Logging
from gluetool.utils import treat_url

print(treat_url('HTTP://FoO.bAr.coM././foo/././../foo/index.html'))
Output:
http://foo.bar.com/foo/index.html
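To give an idea of what such a normalization involves, here is a simplified sketch using only the Python 3 standard library. It is not the implementation of treat_url; `normalize_url` is a hypothetical function that handles just the cases from the example above (case of scheme and host, a trailing dot after the host, and . and .. path segments) and ignores details such as ports, user names and trailing slashes.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url):
    """Lowercase scheme and host, and collapse '.' and '..' path segments."""
    parts = urlsplit(url)  # urlsplit() already lowercases the scheme
    host = parts.netloc.lower().rstrip('.')
    # normpath() resolves '.', '..' and repeated slashes; keep an empty path empty.
    path = posixpath.normpath(parts.path) if parts.path else ''
    return urlunsplit((parts.scheme, host, path, parts.query, parts.fragment))


print(normalize_url('HTTP://FoO.bAr.coM././foo/././../foo/index.html'))
```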
Todo
Features yet to describe:
- dict_update
- converting various command-line options to unified output
- boolean switches via normalize_bool_option
- multiple string values (--foo A,B --foo C => [A,B,C])
- path - expanduser & abspath applied
- multiple paths - like multiple string values, but normalized like above
- “worker thread” class - give it a callable, it will return its return value, taking care of catching exceptions
- running external apps via run_command
- Bunch object for grouping arbitrary data into a single object, or wrapping a dictionary as an object (d[key] => d.key)
- cached_property decorator
- formatted logging of arbitrary command-line - if you have a command-line to format, we have a function for that
- fetch data from a given URL
- load data from YAML or JSON file or string
- write data structures as YAML or JSON
- pattern maps
- waiting for things to finish
- creating XML elements
- checking whether external apps are available and runnable
Tool¶
Todo
Features yet to describe:
- reusable heart of gluetool
- config file on system level, user level or in a local dir
Todo
Features yet to describe:
- custom pylint checkers
- option names
- shared function definitions