Extending ngless and interacting with other projects [4/5]
NOTE: As of Apr 2016, ngless is available only as a pre-release to allow for testing of early versions by fellow scientists and to discuss ideas. We do not consider it /released/ and do not recommend use in production as some functionality is still in testing. Please get in touch if you are interested in using ngless in your projects.
This is the fourth of a series of five posts introducing ngless.
Extending and interacting with other projects [this post]
Miscellaneous
Extending and interacting with other projects
A frequently asked question about ngless is whether the language is extensible. Yes, it is. You can add modules using a simple text-only format (YAML). These modules can then add new functions to ngless. Behind the scenes, calling such a function results in a command line call to a script you write.
For example, to integrate motus into ngless, I used a simple configuration file, which I will describe here.
Every module has a name and a version:
name: 'motus'
version: '0.0.0'
You can add a citation text. This will be shown to all users of your module (citing the software you use is a best practice, so we support it):
citation: "Metagenomic species profiling using universal phylogenetic marker genes"
You can add an init command. This will run before anything else, at the start of the interpretation. It should be quick and check that things are OK. For example, in this case, we check that Python is installed. Thus, if there is a problem, the user gets a fast error message before anything else is run.
init:
    init_cmd: './check-python.sh'
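As an illustration of the kind of check an init command performs, here is a minimal sketch in Python. The module above uses a shell script (check-python.sh) whose contents are not shown in this post, so this is not that script. The names missing_tools and main are made up for the example, and treating a non-zero exit status as "there is a problem" is an assumption rather than documented ngless behaviour.

```python
import shutil
import sys


def missing_tools(tools):
    """Return the names in `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def main(required=("python",)):
    # Hypothetical init check: be quick, print a clear message, and
    # return non-zero (assumed here to signal failure) if anything is missing.
    missing = missing_tools(required)
    if missing:
        sys.stderr.write("Missing required tools: " + ", ".join(missing) + "\n")
        return 1
    return 0

# A standalone script would finish with: sys.exit(main())
```

The point of keeping the check this small is the one made above: it should fail fast, before any expensive step has started.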
Now, we list the functions we are implementing:
function:
In this case, there is just one, corresponding to the ngless function motus.
nglName: "motus"
arg0 is the command to run (which implements this function):
arg0: './run-python.sh'
In ngless, functions have a single unnamed argument and any number of named arguments. So, we first specify arg1, which is special: it describes the single unnamed argument.
arg1:
    filetype: "tsv"
    can_gzip: true
The can_gzip flag lets ngless know that it is OK to pass a compressed file to your script. Now, we list any additional arguments. In this case, there is a required argument:
additional:
    - atype: 'str'
      name: 'ofile'
      def: ''
      required: true
The argument is a string, without a default. That's it. Now, we can use the motus function in an ngless script:
ngless "0.0"
import "motus" version "0.0.0"

input = paired('data/reads.1.fq.gz', 'data/reads.2.fq.gz')
preprocess(input, keep_singles=False) using |read|:
    read = substrim(read, min_quality=25)
    if len(read) < 45:
        discard

mapped = map(input, ref='motus')
mapped = select(mapped) using |mread|:
    mread = mread.filter(min_identity_pc=97)

counted = count(mapped, gff_file='motus.gtf.gz', features=['gene'], multiple={dist1})
motus(counted, ofile='motus-counts.txt')
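On the receiving end of this call is the script named in arg0. Because the module declares can_gzip as true, that script has to cope with compressed input. The following is a minimal sketch of that part in Python. This post does not describe how ngless lays out the command line, so the positional input file and the --ofile=... form used here are assumptions, as are all the function names. The sketch only copies the table, where a real wrapper would call the wrapped tool.

```python
import argparse
import csv
import gzip


def open_maybe_gzip(path):
    """Open `path` as text, decompressing on the fly if it ends in .gz."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", newline="")
    return open(path, "rt", newline="")


def read_tsv(path):
    """Read a (possibly gzipped) tab-separated file into a list of rows."""
    with open_maybe_gzip(path) as handle:
        return list(csv.reader(handle, delimiter="\t"))


def parse_args(argv):
    # Assumed calling convention: <script> INPUT --ofile=OUTPUT
    parser = argparse.ArgumentParser(description="Hypothetical motus wrapper")
    parser.add_argument("input", help="counts table (tsv, possibly gzipped)")
    parser.add_argument("--ofile", required=True, help="output file name")
    return parser.parse_args(argv)


def run(argv):
    """Parse the arguments, read the input table, and write it to ofile."""
    args = parse_args(argv)
    rows = read_tsv(args.input)
    # A real wrapper would hand the table to the wrapped tool here;
    # this sketch just writes it back out uncompressed.
    with open(args.ofile, "w", newline="") as out:
        csv.writer(out, delimiter="\t", lineterminator="\n").writerows(rows)
    return len(rows)
```

Deciding on the file extension keeps the script indifferent to whether it receives a plain or a compressed table, which is all that can_gzip asks of it.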
What can modules do?
An external module can:
- add new functions (will result in a call to a script, which will often be a wrapper around some tool).
- add new reference information (new catalogs, &c). This can even be downloaded on demand (currently [Apr 2016], the module init script must do this itself; in the future, ngless will support just a URL).
- add a citation so that all users of the module will see the citation message. This ensures that if you develop a package which gets wrapped into an ngless module, those final users will still see your citation.
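To make the first point concrete, here is a rough sketch in Python of how a function declaration like the motus one could be turned into a script invocation. This is not how ngless is implemented. The build_command helper, the dictionary layout, and the --name=value convention are all assumptions chosen for illustration; only the field names (arg0, arg1, additional, atype, name, def, required) come from the configuration described in this post.

```python
def build_command(function_spec, input_path, named_args):
    """Turn a function declaration plus a call's arguments into an argv list."""
    argv = [function_spec["arg0"], input_path]
    for arg in function_spec.get("additional", []):
        name = arg["name"]
        if name in named_args:
            value = named_args[name]
        elif arg.get("required", False):
            # A required argument with no value is an error in the user's script.
            raise ValueError("missing required argument: " + name)
        else:
            value = arg.get("def", "")
        # Assumed convention: named arguments are passed as --name=value.
        argv.append("--{}={}".format(name, value))
    return argv


# The motus declaration from this post, written out as a Python dictionary.
motus_spec = {
    "nglName": "motus",
    "arg0": "./run-python.sh",
    "arg1": {"filetype": "tsv", "can_gzip": True},
    "additional": [
        {"atype": "str", "name": "ofile", "def": "", "required": True},
    ],
}

print(build_command(motus_spec, "counts.tsv", {"ofile": "motus-counts.txt"}))
# prints ['./run-python.sh', 'counts.tsv', '--ofile=motus-counts.txt']
```

Under these assumptions, leaving out ofile raises an error before any script is run, which is the behaviour one would expect from an argument marked as required.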