qdb - an online database for CPAN's instant mirroring
Instant mirroring clients use CPAN's update history,
provided in CPAN's
The qdb toolset provides the same information in an online database.
Conceptually, CPAN's update history is an ordered list of events :
⟦ ⟦ tag, type, path ⟧ ... ⟧
- tag is an element of some totally ordered set ;
tag orders the list.
- type is either
- path is a file-path in CPAN,
- in the list,
the order of
delete events matters ;
so we simply require each tag to be unique.
- for each path, only the last path-event matters,
so we may require each path to be unique.
- when a ⟦type,path⟧-event is added to the list,
it must get a tag that is greater than max(tag),
and it may replace any previous path-events.
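
The rules above can be illustrated with a small in-memory model. This is only a sketch of the concept, not qdb's code : the class name `History`, the use of integers for tags, and the type label "new" are assumptions made for the example (only "delete" is named in the text).

```python
from dataclasses import dataclass, field


@dataclass
class History:
    """Conceptual update history: unique tags, at most one event per path."""
    events: dict = field(default_factory=dict)  # path -> (tag, type)
    max_tag: int = 0                            # integer tags are an assumption

    def add(self, type_, path):
        # A new event gets a tag greater than every tag handed out so far.
        self.max_tag += 1
        # Storing by path replaces any previous event for that path.
        self.events[path] = (self.max_tag, type_)
        return self.max_tag

    def ordered(self):
        # The history as a list of (tag, type, path), ordered by tag.
        return sorted((tag, type_, path)
                      for path, (tag, type_) in self.events.items())


h = History()
h.add("new", "a")      # "new" is a placeholder type name
h.add("new", "b")
h.add("delete", "a")   # replaces the earlier event for path "a"
print(h.ordered())
```

After the three calls, only two events remain : the one for "b" and the later one for "a", which carries the highest tag.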
At the moment, CPAN's instant mirroring uses
At the moment, qdb does this :
RECENT-files contain CPAN's update history.
- The files (± 85MB) are distributed in CPAN.
- For tags, RECENT-events have epoch,
a real number that is loosely based on the event's timestamp.
- Because CPAN's history is big,
it is chopped up into a fixed set of chunks :
- recent events are kept in small chunks that change often,
- old events are kept in big chunks that change infrequently.
- Now and then, events are carefully moved (copied, and later deleted)
from one chunk to the next.
- Chunks always overlap, and are never empty.
- Every chunk contains extensive meta-info to link it to the next chunk.
- In CPAN, the
RECENT-files are special in that their
updates do not appear in the
- Maintaining (and using)
RECENT-files is not trivial ;
it is hard to keep the set of chunks complete and consistent.
At the moment, qdb has this :
- qdb provides an online database, containing CPAN's history.
- for tags, qdb uses id : an auto-increment unique number ;
- in the database, path is unique ; schema :
path text NOT NULL UNIQUE ON CONFLICT REPLACE
The database guarantees that id is strictly increasing
and that there is at most one event per path.
- qdb provides access to simple queries like :
last_id : returns MAX(id)
from $n : returns a bunch of events with id≥$n
Basically, that's enough to keep an instant mirroring client happy.
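
The schema and the two queries can be tried out with a small sketch. The ON CONFLICT clause quoted above is SQLite syntax, so the sketch uses Python's sqlite3 module. Only the path constraint comes from the text ; the table name `events`, the other column definitions, the `limit` parameter and the helper names are assumptions, and the type label "new" is a placeholder.

```python
import sqlite3

# Only the 'path' constraint is taken from the text; the rest is assumed.
SCHEMA = """
CREATE TABLE events (
    id   INTEGER PRIMARY KEY AUTOINCREMENT,
    type TEXT NOT NULL,
    path TEXT NOT NULL UNIQUE ON CONFLICT REPLACE
)
"""


def open_db():
    db = sqlite3.connect(":memory:")
    db.execute(SCHEMA)
    return db


def add_event(db, type_, path):
    # If the path already exists, the old row is deleted and the new row
    # gets a fresh id that is larger than any id used before.
    with db:
        db.execute("INSERT INTO events (type, path) VALUES (?, ?)",
                   (type_, path))


def last_id(db):
    # MAX(id); returning 0 for an empty table is a choice made here.
    return db.execute("SELECT COALESCE(MAX(id), 0) FROM events").fetchone()[0]


def events_from(db, n, limit=1000):
    # Events with id >= n in id order; 'limit' bounds the size of the bunch.
    cur = db.execute(
        "SELECT id, type, path FROM events WHERE id >= ? ORDER BY id LIMIT ?",
        (n, limit))
    return cur.fetchall()


db = open_db()
add_event(db, "new", "a")
add_event(db, "new", "b")
add_event(db, "delete", "a")   # replaces the row for path "a"
print(last_id(db), events_from(db, 1))
```

A client that remembers the last id it has seen can ask for events from the next id and apply them in order.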
The qdb toolset can be deployed in various ways,
from low to high impact.
- For the server there is a tool to initialise (and update) the database
from a set of
- A server provides access by running a special daemon and/or
by installing a simple cgi-script.
- The daemon uses a simple question/answer-protocol ;
the answers are json texts.
- The cgi-script parses environment variable
and returns a json text.
- A client gets info by talking to the daemon
and/or issuing http-requests to the cgi-script.
- qdb can be deployed on PAUSE or CPAN-master without
any change to the rrr-toolset :
- a cronjob to keep the database up-to-date,
- the cgi-script to provide access.
- At the other end of the spectrum, qdb could eventually
This would require a minor change in rrr-server
and allow a nice cleanup of rrr-client and iim.
- Many in-between scenarios and transitions can be considered.
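
On the client side, talking to the daemon or the cgi-script could boil down to a loop like the one below. This is a sketch under assumptions, not the qdb protocol : `fetch_from` is a hypothetical callable that returns the server's json text for a "from $n" query, and the json is assumed to be a list of objects with the keys id, type and path. How the request is actually sent (socket or http) is left to that callable.

```python
import json


def sync(fetch_from, apply_event, next_id=1):
    """Pull events until the server has none left; return the next id to ask for."""
    while True:
        # fetch_from(n) is assumed to return a JSON text: a list of events.
        events = json.loads(fetch_from(next_id))
        if not events:
            return next_id
        for ev in events:
            apply_event(ev["type"], ev["path"])
            # Continue after the highest id seen so far.
            next_id = max(next_id, ev["id"] + 1)


# Usage with a stand-in for the server that returns one event per request.
store = [{"id": 2, "type": "new", "path": "b"},
         {"id": 3, "type": "delete", "path": "a"}]


def fake_fetch(n):
    return json.dumps([e for e in store if e["id"] >= n][:1])


seen = []
print(sync(fake_fetch, lambda t, p: seen.append((t, p))), seen)
```

Keeping the transport behind a callable means the same loop works for either access method.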
qdb : documentation, source
Tue Dec 27 15:11:50 CET 2016