Overview of desire, a software knowledge and distribution system

January 16, 2010, 19:37

Table of Contents

1 About this document

This document has a canonical location:

http://www.feelingofgreen.ru/shared/src/desire/doc/overview.html

or, for the Org-mode source:

http://www.feelingofgreen.ru/shared/src/desire/doc/overview.org

In case you are looking for a short, take-no-prisoners introduction, you may consider the crash tutorial:

http://www.feelingofgreen.ru/shared/src/desire/doc/tutorial.html

2 What is desire (supposed to be)?

Desire is a source code management substrate. It once was called a "package management substrate", but no longer, because, while it does some things a package manager like APT or YUM would do, the actual packaging part is not represented.

Desire probably is best described as a sum of its current components:

  • a DSL enumerating software module upstreams and facilitating a global namespace for described objects, augmented with rich supporting facilities;
  • a nascent VCS management layer1;
  • a repository format unifier based on the above, facilitating redistribution of upstream source code in a common DVCS format;
  • a set of inter-module dependency deducers2 for various situations, but most importantly allowing for automated end-user-oriented module retrieval and installation;
  • a number of auxiliary tools based on that substrate:
    • a semi-automated clbuild project definition importer;
    • a fairly automated packager for ASDF-INSTALL;
    • an automated patch applicator providing a set of XCVBified projects;
    • a buildbot with a master/slave architecture and a web interface supporting both static and dynamic page generation.

If this sounds like running in all directions, it probably is, as the goal of desire is to become a substrate supporting all source-code-related activities – module enumeration, (re-)distribution, upstream tracking, release management, packaging, testing etc.

3 What can desire do for you now?

3.1 End user

You can use desire in the same manner as ASDF-INSTALL, but in a somewhat more friendly manner:

  • no non-essential packages need to be installed, because desire has a concept of central system of a module3;
  • you can ask it about available modules in an apropos-like fashion, because of the availability of an explicitly declared nomenclature;

Additionally, you can use it to track non-lisp upstreams – without dependency handling, for obvious reasons.

3.2 Upstreams

There are two modes of operation, a participating one and a non-participating one.

The latter requires almost no involvement on upstream's part – not even use of desire, whereas the former assumes control over the definition of the corresponding part of the global upstream namespace.

In either case you get:

3.2.1 Non-participating upstreams

You don't have to do anything, but perhaps ask the desire maintainers to record definitions of modules released by you and of their dependencies, that is, if they are not recorded already.

3.2.2 Participating upstreams a.k.a. wishmasters

You get to maintain your portion of global definitions and a set of publicly accessible git repositories.

In return, you get following things:

  • easy upstream dependency tracking (really, an end-user mode feature);
  • programmatic API for certain aspects of git repository operation;
  • semi-automatic ASDF-INSTALL exporter;
  • possibility to run a local instance of the buildbot.

3.3 Tinkerer

You know you're in this category when you want to do things like:

  • programmatically generate/maintain/test changes/branches across a set of modules, like the automated XCVBifier does, perhaps for facilitating coordinated release making;
  • programmatically populate the definition namespace (perhaps for a web-sphere where people could register and manipulate their projects);
  • extend the buildbot infrastructure, perhaps adding things like automated package generation phase;

I won't go into details at this point, but I believe that desire provides enough foothold to make all that feasible, if not easy-ish.

Like with any underspecified area, desire employs a degree of heuristics here and there -- and that's where we need specifications to make things predictable and reliable.

4 Installation

The ultimate goal of actions described in this section is to provide the user with a REPL fully prepared for input of desire commands.

4.1 Requirements

Absolute requirements at this point are a POSIX shell, git and SBCL.

4.2 Bootstrap and initialisation

Besides the obvious download and compilation/load steps, desire requires an initialisation procedure parametrised with a source code storage area, in order to become operational.

All these steps can be carried out:

  • manually;
  • in a semi-automated manner, with assist of ASDF-INSTALL;
  • in a completely automated manner, using the bootstrap script.

Before we proceed, a crucial concept of the storage area must be discussed.

The storage area is a writable directory whose writable subdirectories fall into one of the three categories:

  • module repositories maintained by desire through git;
  • foreign VCS subroots containing transitory repositories in foreign formats;
  • metadata repositories used for desire's internode communication and information tracking.

For the time being, the system-wide policy for maintenance of these storage roots is not devised yet, so they are effectively per-user.

Unless you go the fully-automated bootstrap way, the first time you run desire you will need to prepare an empty writeable directory.

4.2.1 Manual installation

root="/absolute/path/to/storage-area/"

The remaining of the shell snippet is designed to be pastable into a shell:

mkdir ${root}
cd ${root}
dependencies="alexandria asdf cl-fad desire executor pergamum iterate"
for dep in ${dependencies}
do
        git clone -o git.feelingofgreen.ru git://git.feelingofgreen.ru/$dep
        # alternative for behind-HTTP-proxy world:
        # git clone -o git.feelingofgreen.ru http://git.feelingofgreen.ru/shared/src/$dep/.git/
        ln -s ${root}/$dep/$dep.asd
done
sbcl --no-userinit --no-sysinit --eval '(load (compile-file "asdf/asdf.lisp"))' \
     --eval "(require :desire)" \
     --eval "(desire:init \"${root}\")" \
     --eval "(in-package :desire)"

4.2.2 Semi-automated installation

This mode is intended to use ASDF-INSTALL, but alas it doesn't work, as you need to be in SBCL without ASDF loaded, because desire depends on a development version of ASDF which is not yet integrated into SBCL.

4.2.3 Fully automated bootstrap

Desire includes a booststrap script which is located at:

http://www.feelingofgreen.ru/shared/src/desire/climb.sh

Download that script and run it like that (the trailing slash is important):

sh climb.sh /absolute/path/to/storage-root/

The reference for various controls follows.

4.2.4 Bootstrap script reference

This script performs following operations:

  • use git to download modules desire depends on, placing them in the provided storage location;
  • load and perform an initial setup of desire;
  • optionally install a desired module and its dependencies;
  • optionally evaluate an expression in the resulting environment.

Invoke it like this:

  climb.sh [OPTION]... [STORAGE-ROOT]
Bootstrap, update or perform other actions on a desire installation
in either STORAGE-ROOT, or a location specified in ~/.climb-root

  -u          Self-update and continue processing other options, using
                the updated version.
  -n HOSTNAME Use HOSTNAME as a bootstrap node.
                HOSTNAME must refer to a node participating in desire protocol.
  -b BRANCH   Check out BRANCH of desire other than 'master'.
  -t BRANCH   Check out BRANCH of metastore on the bootstrap node other than
                the default.  The default is the same as the used branch
                of desire.
  -m MODULE   Retrieve MODULE, once ready.
  -s SYSTEM   Install or update the module relevant to SYSTEM, then load it.
  -a APP      Load system containing APP, as per -s, then launch it.
  -x EXPR     Evaluate an expression, in the end of it all.
  -d          Enable debug optimisation of Lisp code.
  -n          Disable debugger, causing desire dump stack and abort on errors,
                instead of entering the debugger.
  -e          Enable explanations about external program invocations.
  -v          Crank up verbosity.
  -V          Print version.
  -h          Display a help message.

As step zero, when the -u switch is provided, climb.sh is updated using wget from the canonical location at http://www.feelingofgreen.ru/shared/src/desire/climb.sh, and then normal processing is continued, using the updated version.

During the first step, a storage root location is either created or validated. The pre-existing storage root must be a writable directory containing a writable 'git' subdirectory.

When STORAGE-ROOT is not specified, ~/.climb-root is looked up for an absolute pathname referring to a valid storage location. If this condition is met, that directory is accepted as STORAGE-ROOT, otherwise an error is signalled.

When STORAGE-ROOT is specified, it must be either an absolute pathname referring to a valid storage location, or it must denote a non-occupied filesystem location, with a writable parent directory.

During the second step, desire and its dependencies are either retrieved, or updated, in the case when they are already present in STORAGE-ROOT.

Next, a specific branch of desire is checked out, configurable with the -d option and defaulting to "master".

Further, the -n and -t options alter, correspondingly, the hostname of the desire node used for bootstrap, and a branch of that node's metastore to use. These options default to git.feelingofgreen.ru and the name of the branch of desire, accordingly.

During the next step a lisp is started and desire initialisation is attempted, with the above determined values of hostname and metastore branch.

Once the initialisation is complete, MODULE, SYSTEM and APP provide optional convenience shortcuts for module installation, system loading and application launching. Any of these can be omitted, as the required information is easily deduced. Note that the more granular objects determine the objects of lower granularity.

After all these steps, EXPR is executed, if it was provided.

5 Workflows

This section assumes that desire was already made operational by either means described in the previous chapter.

5.1 Re-initialisation

The INIT procedure ensures that repositories constituting your desire node are in working order.

Depending on whether you run a well-known desire node (that is, a wishmaster) you need to provide the :AS keyword to INIT:

(init "/path/to/root/")                            ; for non-well-known mode

or:

(init "/path/to/root/" :as 'your-node-domain-name) ; for wishmaster mode

The specified root directory will contain all VCS-specific master localities, as well as anything module's post-install scripts choose to deliver. This pathname is available in *SELF*, using the ROOT reader.

Unless you already have a '.meta' module, an initial seed version will be downloaded for you. Currently the wishmaster chosen for this is git.feelingofgreen.ru.

This procedure also determines the available VCS tools, as well as conversion tools, and determines the set of accessible remotes.

Further, it scans the git locality for known modules, and makes their systems registered in the ASDF registry.

5.2 User aspect

Unless you happen to have some conversion tools, the set of modules available to your node is restricted to those available via git remotes.

The DESIRE function serves to either initially download or update a set of defined modules.

APROPOS-DESR and LIST-MODULES provide convenient knowledge base query facilities. For a wider set of functions, please see section 3.

5.3 Wishmaster aspect

From the wishmaster point of view the INIT function also does:

  • checks that the locally available set of modules covers every module that is claimed to be "well known" to be published by our distributor5, otherwise signalling an error
  • publishes the informations about non-released, converted modules in the gate remote's DEFINITIONS file
5.3.1 External executables required for module conversion

The conversion is performed by external programs:

  • darcs-to-git6
  • git cvs, debian package git-cvs
  • git svn, debian package git-svn

5.4 Extending definitions

ADD-MODULE and the accompanying reader macro #@"u://r.l/" is a one-stop point useful for manual extension of the set of known entities. The URI type of the URL must name to the VCS used at the given distribution point, that is one of 'git', 'http' (which actually means git+http), 'darcs', 'cvs' or 'svn'.

The required super-entities are either found among current definitions, or created on the spot.

SAVE-DEFINITIONS writes out changes into <value-of-(ROOT *SELF*)>/git/.meta/DEFINITIONS

6 Overview of terms

6.1 DEFINITIONS

This is the term you will often see mentioned, as it is central to the process of distribution of information about the domain.

For the moment it would suffice to say that this is a text file containing forms (which are READable using desire-provided readers) which carry information about terms covered in this section.

6.2 Distributors

The largest unit of software granularity is a distributor, which corresponds to an internet domain name7. Currently, they don't carry much information beyond just that.

Distributors contain one or more remotes, which correspond to points of distribution of sets of similarly-accessible modules.

6.2.1 Wishmasters

Wishmaster is a subtype of distributor which participates in the desire network, in the form of being recorded in DEFINITIONS, as well as by maintaining a gate (a subtype of remotes upon which we will elaborate further below). When applied to wishmasters, the virtue of being registered in some DEFINITIONS is otherwise referred to as property of being well-known.

Two important processes naturally growing out of this are:

  • exchange of DEFINITIONS, be it among well-known wishmasters, or as a process of informing a client node, and
  • upstream module conversion, which ends up populating the gate;

The upstream module conversion process serves to simplify life of average desire users by repackaging modules available through a variety of distribution means8 into a common DVCS format.

The wishmaster corresponding to the local desire instance is available through the variable called *SELF*.

6.3 Locations

Location is a general term for objects storing module repositories, and are either local, in which case they are localities, or non-local, in which case they are, unsurprisingly, /remotes/9.

6.3.1 Remotes

The concept of remote serves as a point of distribution for a group of modules.

In general, all remotes carry the following information:

  • version control system type (git, darcs, cvs or svn),
  • transport type (native, http or rsync),
  • simple pattern for matching module paths on the distributor,
  • an internet port number, and
  • some additional quirks necessary to access the remote repository;
6.3.1.1 Non-gate remotes

Non-gate remotes represent distribution points of non-wishmasters, that is, nodes not participating in the desire protocol.

6.3.1.2 Gate remotes, or gates

Gate remotes, or gates are special remotes which are instrumental to participation in the desire protocol. They carry a special module called .meta, which records the containing distributor's idea about the domain in aforementioned DEFINITIONS.

The information in this file is subject to propagation in the network of hosts participating in the desire protocol.

Gate remotes have a second purpose: as parts of well-known wishmasters they serve for redistribution of modules converted by those wishmasters into a single repository format, currently git. The modules converted in such a way are advertised differently from those which are considered 'released' by the containing distributor.

6.3.2 Localities

Localities serve to express module storage on the local machine. Master localities (except the local gate, about which see below) are canonical transitory locations used for conversion of modules retrieved from remotes of specific VCS types.

The master git locality, also the local gate or a gate locality, is described in the next section, and is supposed to be a canonical storage location for all modules used on the desire node.

A scrupulous reader might note that the above description leaves open a possibility of existence for non-master localities. While it is true, and purposefully so, this concept is not currently employed.

6.3.3 Local gate, or gate locality

*SELF* always contains a special location, a local gate or a gate locality, which is both a remote and a locality and serves a threefold purpose:

  • storage of incoming modules for local consumption,
  • export of the aforementioned converted modules, and
  • distribution point for modules released by the local distributor.

Naturally, the last two points only apply to well-known wishmasters.

6.4 Modules

Modules represent versioned, atomic units of software, as released by the distributor, and, from the point of desire, carry the additional information necessary to complete the information provided by the less granular concepts to obtain the module from its containing remote.

Modules can be provided by several different remotes of different distributors. When the end-user requests retrieval of a module, gate remotes are preferred above others.

Locally, all incoming modules end up in the local gate, which always exists, nevermind the dominant operation mode of the desire node. Once in the local gate, the module becomes /locally available/10, and is made loadable through the preferred local system definition facility.

Locally available modules can be classified into one of the four categories, the first two of which are only applicable to well-known wishmasters:

  • released, for modules advertised through DEFINITIONS to be released through the distributor corresponding to *SELF*;
  • converted, for module originating elsewhere, but advertised in DEFINITIONS as being converted in the gate of *SELF*;
  • unpublished, modules accessible through the gate, yet unadvertised in DEFINITIONS;
  • hidden, modules physically present in the local gate, but made unavailable to anonymous remote clients11.

6.4.1 Pseudo-modules

Pseudo-modules refer to repositories stored in gates used for desire-specific information storage and exchange. Currently there are two common pseudo-modules:

  • .meta, the aforementioned domain-specific information junction point, and
  • .local-meta, a hidden repository used to store local information, which currently amounts to tracking unpublished and hidden modules.

6.5 Systems

Descending further down we meet systems. Systems are objects only meant to be relevant in the domain of Common Lisp software, and more precisely – to backend system definition facilities, such as ASDF, XCVB, Mudballs or others12.

The concept of system introduces inter-system dependencies, which cross module boundaries, producing inter-module dependencies.

Evidently, there can be several systems per module, and also those can be obscured from the end-user, either intentionally or by unfortunate accident13.

Desire handles all these complications and operates on the full inter-module dependency graph. It also doesn't store that graph anywhere, recomputing it, instead, every time a request for a module is performed.

It should be noted, that there is no requirement for modules to have systems, which enables end-users to manage (and provide) gittified non-Lisp software for local (and not-so-local) needs.

6.6 Applications

Applications are simple extensions of systems, providing some very preliminary support for launching programs. They are intended to simplify end-user experience by making requests such as "run climacs" expressible and actionable.

7 API (aka end-user interface)

7.1 Initial chores & storage location choice

init path &key as (default-wishmasters (list desr:*default-wishmaster*)) => <no values>

Initialise desire with PATH chosen as directory for storage of all VCS-specific locations.

When AS is non-NIL an attempt is made to establish an identity to a defined distributor named by the AS keyword.

This is performed by checking that the locally available set of modules covers every module that is claimed to be to be published by our distributor5, according to the local DEFINITIONS. When this check fails an error is signalled.

*self* => distributor

The local distributor set up during INIT, be it well-known or not.

root local-distributor => pathname

The root directory containing all VCS-specific locations of LOCAL-DISTRIBUTOR, chosen during INIT-time.

7.2 Performing knowledge base queries

distributor name &key (if-does-not-exist :error) => distributor
remote name &key (if-does-not-exist :error) => remote
module name &key (if-does-not-exist :error) => module
system name &key (if-does-not-exist :error) => system
app name &key (if-does-not-exist :error) => app
locality name &key (if-does-not-exist :error) => locality

Find objects by name.

name object => symbol

Yield object's name.

url remote-designator &optional module-specifier => string

Compute the URL of a module designated by MODULE-SPECIFIER contained a remote designated by REMOTE-DESIGNATOR.

apropos-desr string-designator &optional set-designator => <no values>

Like APROPOS, but finds objects from the domain of desire.

apropos-desr-list string-designator &optional set-designator => desirables

Like APROPOS-LIST, but finds objects from the domain of desire.

list-modules => <no values>

List all known modules, with some additional information.

module-hidden-p module-designator &optional (locality (gate *self*)) => boolean

See whether whether module designated by MODULE-DESIGNATOR is unavailable to anonymous remote clients.

module-present-p module-designator &optional (locality (gate *self*)) check-when-present-p (check-when-missing-p t) => boolean

Determine whether module designated by MODULE-DESIGNATOR is present in LOCALITY, which defaults to the local gate locality.

local-summary &optional (stream *standard-output*) => <no values>

Print a summary about modules within the local gate to STREAM.

module-best-remote module-designator &key (if-does-not-exist :error) => remote

Produce the remote, if any, which will be chosen to satisfy desires for module designated by MODULE-DESIGNATOR.

module-best-distributor module-designator &key (if-does-not-exist :error) => remote

Produce the distributor, if any, whose remote will be chosen to satisfy desires for module designated by MODULE-DESIGNATOR.

module-fetch-url module &key allow-self => string

Return the URL which is to be used while fetching MODULE, that is the location of MODULE in the preferred remote. When ALLOW-SELF is specified, and non-NIL, remotes within *SELF* are not discarded from consideration.

touch-module module => boolean, string

Try 'access' MODULE via its preferred remote and return whether the attempt was successful as the primary value, and the output of the toucher executable as the secondary value.

system-loadable-p system-designator &optional (locality (gate *self*)) => generalised-boolean

Determine whether system designated by SYSTEM-DESIGNATOR is loadable in LOCALITY, which defaults to the local gate locality.

7.3 Making wishes

desire module-name-or-names => boolean

Make modules with MODULE-NAME-OR-NAMES locally available.

add-module url &key module-name systemlessp (system-type desr:*default-system-type*) obtain => module

Define a new module, with download location specified by URL, and the module's name either deduced from the URL, or provided via MODULE-NAME.

When OBTAIN is non-NIL, the module is fetched after its definition is internalised.

update module-designator &optional (locality (gate self)) => <no values>

Update a module specified by MODULE-DESIGNATOR, possibly specifying the target LOCALITY.

make-module-unpublished module-designator &optional (locality (gate *self*))

Stop advertising MODULE in DEFINITIONS, without completely hiding it. If it is hidden, unhide it.

hide-module module-designator &optional (locality (gate *self*))

Stop advertising MODULE in DEFINITIONS, as well as make it inaccessible to general public through LOCALITY.

*fetch-errors-serious* => boolean

Whether to raise an error when external executables fail to fetch modules during DESIRE or UPDATE. Defaults to NIL.

*follow-upstream* => boolean

Whether tracking upstream should update HEAD. Defaults to T.

*dirty-repository-behaviour* => keyword

Whenever a dirty repository comes up in a situation which requires a clean one to proceed, do accordingly to the value of this variable:

  • :RESET - reset the dirty repository, losing unsaved changes,
  • :STASH - reset the dirty repository, stashing unsaved changes,
  • :ERROR - raise an error.

Defaults to :RESET.

7.3.1 Reader macros for add-module

Following reader macro is enabled by install-add-module-reader:

#@"u://r.l"
#@("u://r.l" &key name obtain)

7.4 Less frequently used functions

system-definition system repository-path &key (if-does-not-exist :error) => pathname

Return the pathname of the SYSTEM's definition.

clear-definitions => <no values>

Forget everything. A subsequent READ-DEFINITIONS will be instrumental to continue any productive use.

remove-remote remote-designator &key keep-modules => nil

Forget everything associated with a remote specified by REMOTE-DESIGNATOR, optionally, when KEEP-MODULES is non-NIL, keeping modules referred by it.

remove-module module-designator &key keep-localities => nil

Forget everything associated with module specified by MODULE-DESIGNATOR, including its systems and applications.

remove-system system-designator => nil

Forget everything associated with the system specified by SYSTEM-DESIGNATOR.

save-definitions &key seal => <no values>

Write out the current idea about the desire's domain into DEFINITIONS, optionally committing changes, when SEAL is non-NIL.

read-definitions &key (source self) (force-source (eq source self)) (metastore (meta-path)) => <no values>

Append definitions currently available in METASTORE to the current idea about desire's domain.

8 Shortcomings

Some known problems:

  • SBCL-only
  • ASDF-only
  • Linux-only (might work on other unices)
  • has a non-trivial amount of CL library dependencies, half of them not exactly being common
  • calls out to an obscene amount of external executables, thereby only being able to guess about failure reasons

Footnotes:

1 Git-centric.

2 Three of them: full information, incremental and an ASDF-INSTALL-like post-factum one.

3 It's a heuristic, if you are curious, but it works pretty well.

4 Intermittent availability only, as of now.

5 This is tied to the concept of well known release locations and differs from the set of modules converted and reexported in the wishmaster process.

6 Available through git://github.com/purcell/darcs-to-git.git/

7 Actually, sometimes a group of domain names, like in case of sourceforge.

8 Currently supported release mechanisms are: git, darcs, cvs, svn and tarballs. Additionally, this is extended by some transport variety, like, for example, rsync.

9 The term is borrowed from the git terminology.

10 As per MODULE-LOCALLY-PRESENT-P.

11 In git this is accomplished by ensuring that the relevant repository lacks a .git/git-daemon-export-ok file.

12 Currently, the only backend system implemented is ASDF.

13 Recovering such hidden systems complicates construction of full dependency graph in case of ASDF.

Author: Samium Gromoff

Date: 2011-01-27 15:07:31 MSK

HTML generated by org-mode 7.4 in emacs 23