Operations and troubleshooting
This page covers the operational commands in DRAKKAR: database preparation, configuration, status inspection, logging, result transfer, output layout, and common recovery tasks.
Operations overview
| Command | Purpose | Typical use |
|---|---|---|
| `drakkar database` | Install or update supported annotation database releases. | Prepare KEGG, CAZy, PFAM, AMR, or VFDB resources before annotation. |
| `drakkar config` | View or edit the installed workflow configuration. | Inspect or change database paths and default settings. |
| `drakkar logging` | Inspect workflow metadata and Snakemake logs. | Diagnose failed runs, locked directories, and progress state. |
| `drakkar status` | Show rule and sample progress for a workflow run. | Monitor the latest run, a selected output directory, or one metadata YAML. |
| `drakkar transfer` | Transfer selected outputs by SFTP while preserving structure. | Move results from cluster storage to long-term or collaborator storage. |
| `drakkar unlock` | Remove a Snakemake lock from a broken output directory. | Recover after interrupted runs. |
| `drakkar update` | Reinstall DRAKKAR from the Git repository in the current environment. | Refresh the installed CLI and workflow package. |

See also Snakemake and SLURM management for the Snakemake and SLURM override flags available on every workflow command.
Database
Installs or updates one managed annotation database release at a time. This is
a maintenance workflow and is not triggered by drakkar complete.
Supported database subcommands:
- `kegg` (alias: `kofams`)
- `cazy`
- `pfam`
- `vfdb`
- `amr`
Examples:
$ drakkar database amr --directory /projects/alberdilab/data/databases/drakkar/amr --version 2025-07-16.1
$ drakkar database kegg --directory /projects/alberdilab/data/databases/drakkar/kofams --version 2026-02-01 --set-default
$ drakkar database kegg --directory /projects/alberdilab/data/databases/drakkar/kofams --version 2026-02-01 --download-runtime 180
$ drakkar database cazy --directory /projects/alberdilab/data/databases/drakkar/cazy --version V14 --set-default
$ drakkar database pfam --directory /projects/alberdilab/data/databases/drakkar/pfam --version Pfam37.4 --set-default
$ drakkar database vfdb --directory /projects/alberdilab/data/databases/drakkar/vfdb --set-default
Options:
- `--directory`: base directory where the release folder will be created.
- `--version`: folder name to create inside `--directory`. For `kegg`, use the KEGG archive date such as `2026-02-01`. For `cazy`, use the upstream dbCAN release label such as `V14`. For `pfam`, use the Pfam release directory name such as `Pfam37.4`. For `amr`, use the NCBI AMRFinder release directory name such as `2025-07-16.1`. For `vfdb`, you can omit `--version` and DRAKKAR will use the UTC download date.
- `--download-runtime`: runtime in minutes for the database download and preparation rule (default: `120`).
- `--set-default`: update the corresponding database path in `config.yaml` after installation.
- `-e/--env_path`: shared Conda environment directory.
- `-p/--profile`: Snakemake profile.
Behavior:
- The selected database is installed into `--directory/--version/`.
- For managed annotation databases, `config.yaml` stores the release directory, not the internal HMM or MMseqs prefix file.
- The workflow resolves the expected internal files automatically, for example `kofams`, `pfam`, `amr.tsv`, or `vfdb`.
- `--set-default` rewrites that config entry to the newly installed release directory.
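The install location described above can be sketched in a few lines of Python. This is an illustration only: `release_dir` is a hypothetical helper, not part of DRAKKAR, and the `YYYY-MM-DD` format used for the UTC fallback date is an assumption, since this page does not state the exact format.

```python
from datetime import datetime, timezone
from pathlib import Path


def release_dir(directory, version=None, now=None):
    """Return <directory>/<version>, using a UTC date when no version is given."""
    if version is None:
        # Assumption: ISO date format; the format DRAKKAR uses is not documented here.
        now = now or datetime.now(timezone.utc)
        version = now.strftime("%Y-%m-%d")
    return Path(directory) / version


print(release_dir("/projects/alberdilab/data/databases/drakkar/amr", "2025-07-16.1"))
```

With `--set-default`, it is this release directory (not a file inside it) that ends up in `config.yaml`.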
Database-specific rules:
- `kegg` (alias: `kofams`): use a KEGG archive date in `YYYY-MM-DD` format, such as `2026-02-01`. DRAKKAR downloads `profiles.tar.gz` from `https://www.genome.jp/ftp/db/kofam/archives/<version>/`, extracts the HMM profiles, concatenates them into a single `kofams` database, downloads the KEGG hierarchy JSON, and runs `hmmpress`. If the archive is missing, DRAKKAR points you to `https://www.genome.jp/ftp/db/kofam/archives/`. The default `--download-runtime` is `120` minutes and is mainly intended for this large download.
- `cazy`: use the dbCAN release label, such as `V14`. DRAKKAR downloads the dbCAN HMM database from `https://pro.unl.edu/dbCAN2/download_file.php?file=Databases/<version>/dbCAN-HMMdb-<version>.txt` and runs `hmmpress`. If the requested release is missing, DRAKKAR points you to `https://pro.unl.edu/dbCAN2/browse_download.php`.
- `pfam`: use the Pfam release directory name, such as `Pfam37.4`. DRAKKAR downloads `Pfam-A.hmm.gz` from `https://ftp.ebi.ac.uk/pub/databases/Pfam/releases/<version>/`, downloads the EC mapping table, unzips the HMM file, and runs `hmmpress`. If the requested release is missing, DRAKKAR points you to `https://ftp.ebi.ac.uk/pub/databases/Pfam/releases/`.
- `amr`: use the NCBI AMRFinder release directory name, such as `2025-07-16.1`. DRAKKAR downloads both `NCBIfam-AMRFinder.HMM.tar.gz` and `NCBIfam-AMRFinder.tsv` from `https://ftp.ncbi.nlm.nih.gov/hmm/NCBIfam-AMRFinder/<version>/`, merges the extracted HMMs into one database, and runs `hmmpress`. If the requested release is missing, DRAKKAR points you to `https://ftp.ncbi.nlm.nih.gov/hmm/NCBIfam-AMRFinder/`.
- `vfdb`: there is no upstream version directory. DRAKKAR downloads the current `VFDB_setB_pro.fas.gz` from `https://www.mgc.ac.cn/VFs/Down/VFDB_setB_pro.fas.gz` and creates the MMseqs2 database. If `--version` is omitted, it uses the UTC download date as the release folder and logged version.
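To check a version label before starting a long download, the URL patterns above can be expanded by hand. The following is a minimal sketch: `SOURCES` and `source_url` are hypothetical names, the templates are copied from the rules above, and only the main HMM or FASTA download for each database is covered. How DRAKKAR builds these URLs internally is not shown on this page.

```python
import re

# URL templates copied from the database rules; <version> becomes {version}.
SOURCES = {
    "kegg": "https://www.genome.jp/ftp/db/kofam/archives/{version}/profiles.tar.gz",
    "cazy": "https://pro.unl.edu/dbCAN2/download_file.php?file=Databases/{version}/dbCAN-HMMdb-{version}.txt",
    "pfam": "https://ftp.ebi.ac.uk/pub/databases/Pfam/releases/{version}/Pfam-A.hmm.gz",
    "amr": "https://ftp.ncbi.nlm.nih.gov/hmm/NCBIfam-AMRFinder/{version}/NCBIfam-AMRFinder.HMM.tar.gz",
    "vfdb": "https://www.mgc.ac.cn/VFs/Down/VFDB_setB_pro.fas.gz",  # unversioned upstream
}


def source_url(database, version=None):
    """Expand the documented URL pattern for one database release."""
    if database == "kofams":  # documented alias for kegg
        database = "kegg"
    if database != "vfdb" and not version:
        raise ValueError(f"{database} requires a version label")
    if database == "kegg" and not re.fullmatch(r"\d{4}-\d{2}-\d{2}", version):
        raise ValueError("kegg versions use the YYYY-MM-DD archive date")
    return SOURCES[database].format(version=version)


print(source_url("pfam", "Pfam37.4"))
```

Opening the printed URL in a browser is a quick way to confirm that the requested release exists upstream.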
Version logging:
- Each run writes `database_versions.yaml` inside the installed release directory.
- The log records the requested version, resolved install directory, source URLs, source-version label, and installed asset checksums and file sizes.
Config
Views or edits the installed DRAKKAR configuration file at
drakkar/workflow/config.yaml.
$ drakkar config --view
$ drakkar config --edit
Options:
- `--view`: print the config file path and contents.
- `--edit`: open the config file in a terminal editor.
Behavior:
- `--edit` uses `$VISUAL`, then `$EDITOR`, then falls back to `nano`, `vim`, or `vi`.
- The command edits the installed package config directly, so changes affect later workflow runs from that installation.
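The editor lookup order can be illustrated with a short Python sketch. `pick_editor` is a hypothetical function written for this page, not DRAKKAR's code; it only mirrors the documented order of `$VISUAL`, `$EDITOR`, then the first of `nano`, `vim`, or `vi` found on `PATH`.

```python
import os
import shutil


def pick_editor(env=None, which=shutil.which):
    """Return the editor command chosen by the documented fallback order."""
    env = os.environ if env is None else env
    for variable in ("VISUAL", "EDITOR"):
        if env.get(variable):  # skip unset or empty variables
            return env[variable]
    for candidate in ("nano", "vim", "vi"):
        if which(candidate):  # first fallback editor available on PATH
            return candidate
    return None


print(pick_editor({"EDITOR": "vim"}))
```

If the wrong editor opens, checking `echo $VISUAL` and `echo $EDITOR` in the same shell usually explains why.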
Snakemake and SLURM management
Every workflow subcommand (complete, preprocessing, cataloging,
profiling, annotating, expressing, dereplicating,
inspecting, database, and environments) accepts the flags
described in this section. They let you tune resource limits, override
Snakemake profile settings, and pass SLURM directives without editing profile
files.
Resource caps (config.yaml)
drakkar/workflow/config.yaml contains four resource-related keys that act
as cluster-wide guardrails:
- `SNAKEMAKE_MAX_GB`: maximum memory any single rule may request, in gigabytes. Default: `1024`. Dynamic per-rule memory requests are capped at this value.
- `SNAKEMAKE_MAX_TIME`: maximum runtime any single rule may request, in minutes. Default: `20160` (14 days).
- `MEMORY_MULTIPLIER`: a global integer factor applied to every per-rule memory request before the `SNAKEMAKE_MAX_GB` cap is enforced. Default: `1`. Increase this when a workflow consistently runs out of memory due to unusually large samples.
- `TIME_MULTIPLIER`: equivalent factor for runtime requests before the `SNAKEMAKE_MAX_TIME` cap. Default: `1`. Increase when jobs time out on a slow or heavily loaded cluster.
Edit these values with drakkar config --edit or set them on the command
line with the flags below.
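The interaction between the multipliers and the caps comes down to one line of arithmetic: scale first, then clamp. The sketch below uses a hypothetical `capped_request` function and made-up base requests; it illustrates the documented behavior and is not DRAKKAR's implementation.

```python
def capped_request(base, multiplier=1, cap=1024):
    """Scale a per-rule request by the multiplier, then enforce the cap."""
    return min(base * multiplier, cap)


# Memory in GB against the default SNAKEMAKE_MAX_GB of 1024.
print(capped_request(64, multiplier=2))    # 128
print(capped_request(768, multiplier=2))   # 1024 (1536 would exceed the cap)

# Runtime in minutes against the default SNAKEMAKE_MAX_TIME of 20160.
print(capped_request(14400, multiplier=2, cap=20160))  # 20160
```

A multiplier therefore has no effect on rules whose scaled request already reaches the cap; raising the cap itself is the only way to give those rules more.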
Resource multiplier flags
--memory-multiplier N and --time-multiplier N apply the same scaling
as MEMORY_MULTIPLIER / TIME_MULTIPLIER in config.yaml but without
permanently changing the installed config. The command-line value overrides the
config value for that run only.
$ drakkar cataloging -f input.tsv -o drakkar_output --memory-multiplier 2
$ drakkar profiling -b /path/to/bins -o drakkar_output --time-multiplier 3
Both flags accept any positive integer. They are most useful when a specific workflow run is expected to be unusually resource-intensive.
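The precedence between the command line and `config.yaml` can be sketched as follows. `effective_multiplier` is a hypothetical helper, and `None` stands for "flag not given"; the real argument handling in DRAKKAR may differ.

```python
def effective_multiplier(cli_value, config_value=1):
    """Prefer the command-line value for this run; otherwise use the config value."""
    value = config_value if cli_value is None else cli_value
    # Both sources must provide a positive integer.
    if isinstance(value, bool) or not isinstance(value, int) or value < 1:
        raise ValueError("multiplier must be a positive integer")
    return value


print(effective_multiplier(None, config_value=2))  # config value applies
print(effective_multiplier(3, config_value=2))     # command line wins for this run
```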
Snakemake override flags
These flags override the corresponding settings in the active Snakemake profile without modifying profile files. All are optional; omitting a flag leaves the profile value in effect.
- `--snakemake-jobs N`: maximum number of concurrent SLURM jobs. Overrides the profile value (typical default: `100`).
- `--snakemake-cores N`: maximum local CPU cores when using the local executor. Overrides the profile value.
- `--snakemake-executor EXECUTOR`: Snakemake executor plugin, e.g. `slurm` or `local`. Overrides the profile value.
- `--snakemake-latency-wait N`: seconds to wait for output files before failing a rule. Overrides the profile value (slurm default: `300`, local default: `60`). Raise this on shared filesystems with high metadata latency.
- `--snakemake-retries N`: number of times to retry a failed job. Overrides the profile value (slurm default: `3`).
- `--snakemake-rerun-incomplete`: force rerun of jobs whose output files were left incomplete by a previous interrupted run.
- `--snakemake-keep-going`: continue running independent jobs after a failure instead of stopping immediately.
Examples:
$ drakkar complete -f input.tsv -o drakkar_output --snakemake-jobs 50 --snakemake-retries 5
$ drakkar cataloging -f input.tsv -o drakkar_output --snakemake-executor local --snakemake-cores 32
$ drakkar profiling -b bins/ -o drakkar_output --snakemake-rerun-incomplete --snakemake-keep-going
SLURM override flags
These flags inject SLURM directives into Snakemake’s --default-resources
without requiring changes to the SLURM profile or cluster config.
- `--slurm-partition NAME`: SLURM partition (queue) to submit all jobs to.
- `--slurm-account NAME`: SLURM billing account.
- `--slurm-constraint EXPR`: node constraint expression, e.g. `gpu` or `skylake`.
- `--slurm-nodes N`: number of nodes per SLURM job (default: `1`).
- `--slurm-nodelist NODES`: restrict jobs to a specific node or node list, e.g. `node01` or `node[01-03]`.
- `--slurm-extra ARGS`: arbitrary extra `sbatch` arguments passed verbatim, e.g. `'--mail-type=END --mail-user=you@example.com'`.
Examples:
$ drakkar complete -f input.tsv -o drakkar_output --slurm-partition gpu --slurm-account myproject
$ drakkar annotating -b bins/ -o drakkar_output --slurm-extra '--mail-type=END --mail-user=you@example.com'
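When runs are launched from a script, the override flags can be assembled programmatically. The sketch below builds an argument list from keyword arguments; `drakkar_command` is a hypothetical wrapper, and it assumes only the flag names documented in this section. It prints the command rather than running it.

```python
import shlex


def drakkar_command(workflow, base_args, **overrides):
    """Build an argv list; keyword names map to flags (snakemake_jobs -> --snakemake-jobs)."""
    argv = ["drakkar", workflow, *base_args]
    for name, value in overrides.items():
        flag = "--" + name.replace("_", "-")
        if value is True:
            argv.append(flag)  # boolean switch such as --snakemake-keep-going
        elif value is not None and value is not False:
            argv.extend([flag, str(value)])  # flag followed by its value
    return argv


cmd = drakkar_command(
    "complete",
    ["-f", "input.tsv", "-o", "drakkar_output"],
    snakemake_jobs=50,
    slurm_partition="gpu",
    snakemake_keep_going=True,
)
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would execute it; keeping the arguments as a list avoids quoting problems with values such as the `--slurm-extra` string.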
SLURM benchmarking
After each workflow run, DRAKKAR queries sacct for the jobs submitted
during that run and writes a resource-efficiency summary. This produces:
- `benchmark/`: per-job resource tables under the output directory.
- `drakkar_<run_id>_resources.yaml`: root-level summary of CPU time, memory peaks, and efficiency ratios for the run.
The resource summary is also shown by drakkar logging alongside the
workflow execution summary.
To skip benchmark collection, pass --skip-benchmark to any workflow
command:
$ drakkar preprocessing -i /path/to/reads -o drakkar_output --skip-benchmark
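This page does not define how the efficiency ratios are calculated. As a rough guide to reading them, the sketch below computes one common definition of CPU efficiency from the `sacct` fields `TotalCPU`, `Elapsed`, and `AllocCPUS`. Both function names are hypothetical, and the formula is an assumption rather than DRAKKAR's documented method.

```python
def slurm_seconds(text):
    """Convert a SLURM duration such as '1-02:03:04', '02:03:04', or '00:05.123' to seconds."""
    days, _, clock = text.rpartition("-")
    parts = [float(piece) for piece in clock.split(":")]
    while len(parts) < 3:  # pad missing hours (and minutes) with zeros
        parts.insert(0, 0.0)
    hours, minutes, seconds = parts
    return int(days or 0) * 86400 + hours * 3600 + minutes * 60 + seconds


def cpu_efficiency(total_cpu, elapsed, alloc_cpus):
    """Used CPU time divided by the CPU time that was reserved (assumed definition)."""
    reserved = slurm_seconds(elapsed) * alloc_cpus
    return slurm_seconds(total_cpu) / reserved if reserved else 0.0


# Two CPU-hours used on four cores reserved for one hour.
print(cpu_efficiency("02:00:00", "01:00:00", 4))  # 0.5
```

Under this definition, a low ratio suggests that a rule requests more cores than it uses.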
Status
Shows progress for the latest or selected DRAKKAR workflow run without restarting Snakemake.
$ drakkar status
$ drakkar status -d drakkar_output --rules
$ drakkar status drakkar_20260510-032711.yaml --samples
Options:
- `target`: optional output directory or `drakkar_<run_id>.yaml` metadata file. If omitted, DRAKKAR inspects the current directory.
- `-d/--directory` or `-o/--output`: output directory to inspect.
- `--run`: specific run ID or `drakkar_<run_id>.yaml` file name.
- `--rules`: show rule-focused progress only.
- `--samples`: show sample-focused progress only.
- `--complete`: include helper rules that are hidden by default.
Behavior:
- The default view shows overall progress, rule progress for main rules, and sample-stage progress.
- Rule totals are parsed from the captured Snakemake job stats and completion lines in `log/drakkar_<run_id>.snakemake.log`.
- Sample stages are inferred from observed sample or assembly wildcards and the workflow sample dictionaries under `data/`.
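For a quick progress check without `drakkar status`, the Snakemake log can be scanned directly. Snakemake prints completion lines of the form `3 of 12 steps (25%) done`; the sketch below extracts the most recent one. `latest_progress` is a hypothetical helper, and whether DRAKKAR relies on exactly these lines is an assumption.

```python
import re

# Matches Snakemake completion lines such as "3 of 12 steps (25%) done".
STEP_LINE = re.compile(r"(\d+) of (\d+) steps \(\d+(?:\.\d+)?%\) done")


def latest_progress(log_text):
    """Return (done, total) from the last completion line, or None if there is none."""
    matches = STEP_LINE.findall(log_text)
    if not matches:
        return None
    done, total = matches[-1]
    return int(done), int(total)


sample_log = "1 of 12 steps (8%) done\nFinished job 4.\n3 of 12 steps (25%) done\n"
print(latest_progress(sample_log))  # (3, 12)
```

In practice, `log_text` would be the contents of `log/drakkar_<run_id>.snakemake.log`.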
Logging
Inspects workflow metadata and persistent Snakemake logs to troubleshoot failed or interrupted runs.
$ drakkar logging -o drakkar_output
$ drakkar logging -o drakkar_output --summary
$ drakkar logging -o drakkar_output --run 20260503-101530 --paths
Options:
- `-o/--output`: output directory to inspect.
- `--run`: specific run ID (`YYYYMMDD-HHMMSS`) or `drakkar_<run_id>.yaml` file name.
- `--summary`: print only the parsed workflow summary.
- `--tail`: number of trailing log lines to show if no failure excerpt is found and `--summary` is not used (default: `50`).
- `--full`: print the full Snakemake log.
- `--paths`: list relevant metadata and log file paths.
- `--list`: list available workflow runs in the output directory.
Behavior:
- Workflow runs write root metadata files such as `drakkar_20260503-101530.yaml`.
- Snakemake stdout/stderr is captured persistently in `log/drakkar_20260503-101530.snakemake.log`.
- The default logging view includes a parsed execution summary with planned jobs, observed rule executions, workflow progress, and detected error types.
- If the output directory is locked, run `drakkar logging -o <output_dir>` before using `drakkar unlock` or `--overwrite`.
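The naming scheme makes it easy to locate runs by hand. The sketch below picks run IDs out of a list of file names and derives the matching log path; `run_ids` and `snakemake_log` are hypothetical helpers that rely only on the file name patterns documented on this page.

```python
import re
from pathlib import PurePosixPath

# Root metadata files are named drakkar_<YYYYMMDD-HHMMSS>.yaml.
RUN_FILE = re.compile(r"drakkar_(\d{8}-\d{6})\.yaml")


def run_ids(file_names):
    """Return run IDs found in file_names, oldest first."""
    found = (RUN_FILE.fullmatch(name) for name in file_names)
    # The timestamp format sorts chronologically as plain text.
    return sorted(match.group(1) for match in found if match)


def snakemake_log(output_dir, run_id):
    """Return the persistent Snakemake log path for one run."""
    return PurePosixPath(output_dir) / "log" / f"drakkar_{run_id}.snakemake.log"


names = [
    "drakkar_20260510-032711.yaml",
    "drakkar_20260503-101530.yaml",
    "drakkar_20260503-101530_resources.yaml",  # resource summary, not a run file
    "cataloging.tsv",
]
latest = run_ids(names)[-1]
print(snakemake_log("drakkar_output", latest))
```

For a real directory, `names` could come from `os.listdir(output_dir)`.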
Transfer
Transfers outputs via SFTP while preserving the original folder structure. The remote base directory must already exist.
$ drakkar transfer --host example.org --user you -l drakkar_output -r /remote/path --results -v
Flags:
- `--all`: transfer the entire output directory.
- `--data`: transfer everything except `.snakemake`.
- `--results`: transfer the union of `-a/-m/-p/-b/-e`.
- `-a/--annotations`: annotation outputs.
- `-m/--mags`: dereplicated MAGs.
- `-p/--profile`: profiling outputs.
- `-e/--expression`: expression outputs.
- `-b/--bins`: cataloging bins recursively.
- `--erda`: use ERDA defaults (`io.erda.dk`).
- `-v/--verbose`: log each transfer.
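Preserving the folder structure means that each file keeps its path relative to the local directory when it is placed under the remote base. The sketch below shows that mapping under stated assumptions: `remote_path` is a hypothetical helper, `genes.tsv` is a made-up file name, and taking paths relative to the directory passed with `-l` is an assumption about DRAKKAR's behavior.

```python
from pathlib import PurePosixPath


def remote_path(local_file, local_base, remote_base):
    """Map a local file to its destination, keeping the path below local_base."""
    relative = PurePosixPath(local_file).relative_to(local_base)
    return PurePosixPath(remote_base) / relative


print(remote_path("drakkar_output/annotating/genes.tsv", "drakkar_output", "/remote/path"))
```

Only the remote base directory has to exist beforehand, as noted above; intermediate folders follow from the relative path.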
Maintenance commands
Unlock a working directory if Snakemake left a lock:
$ drakkar unlock -o drakkar_output
Update DRAKKAR in the current environment:
$ drakkar update
Pass --skip-deps to refresh the package without reinstalling Python
dependencies (useful when only the workflow scripts have changed):
$ drakkar update --skip-deps
Outputs
Key output locations:
- `preprocessing/`: cleaned reads and preprocessing summaries.
- `cataloging/`: assemblies, bins, and bin metadata.
- `cataloging.tsv`: assembly, mapping, and binning summary table.
- `profiling_genomes/`: dereplication, mapping, and abundance tables.
- `profiling_pangenomes/`: pangenome profiling outputs.
- `annotating/`: annotation tables.
- `expressing/`: expression outputs.
- `dereplicating/`: dereplicated genomes in dereplication-only mode.
- `benchmark/`: per-SLURM-job resource tables written after each workflow run.
- `drakkar_<run_id>.yaml`: workflow run metadata.
- `drakkar_<run_id>_resources.yaml`: root-level SLURM resource-efficiency summary for the run (CPU time, memory peaks, and efficiency ratios).
- `log/drakkar_<run_id>.snakemake.log`: persistent Snakemake stdout/stderr capture for a workflow run.
- `<directory>/<version>/database_versions.yaml`: installation log for a managed database release.
Troubleshooting
- Locked directory: first run `drakkar logging -o <output_dir>` to inspect the latest workflow log, then use `drakkar unlock -o <output_dir>` or rerun with `--overwrite`.
- Missing bins: provide `-b/--bins_dir` or `-B/--bins_file`.
- Missing reads: provide `-r/--reads_dir` or `-R/--reads_file`.
- SFTP errors: ensure the remote directory exists and the credentials are valid.