pySeQuiLa is a Python wrapper for SeQuiLa, an ANSI SQL-compliant solution for distributed processing of next-generation sequencing (NGS) data, built on top of Apache Spark.
pySeQuiLa extends Apache Spark with highly efficient implementations of common bioinformatics operations such as interval joins, depth of coverage, and pileup.
It combines the analytical power of Python with SQL syntax for flexible querying and processing of NGS data.
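
To make the operations named above concrete, here is a minimal plain-Python sketch of what an interval join and a depth-of-coverage calculation compute. It is not pySeQuiLa's API or its distributed implementation: the `Interval` record, both function names, and the closed 1-based coordinates are assumptions made only for illustration, and the join is a naive quadratic scan.

```python
from collections import namedtuple

# Hypothetical record type for illustration; not part of pySeQuiLa.
# Coordinates are assumed to be 1-based and closed on both ends.
Interval = namedtuple("Interval", ["contig", "start", "end"])


def interval_join(reads, targets):
    """Return (read, target) pairs on the same contig whose ranges overlap."""
    return [
        (r, t)
        for r in reads
        for t in targets
        if r.contig == t.contig and r.start <= t.end and t.start <= r.end
    ]


def depth_of_coverage(reads, contig):
    """Return {position: depth} for every covered position on one contig."""
    # Record +1 where a read starts and -1 just past where it ends.
    events = {}
    for r in reads:
        if r.contig != contig:
            continue
        events[r.start] = events.get(r.start, 0) + 1
        events[r.end + 1] = events.get(r.end + 1, 0) - 1

    # Sweep the event positions in order, carrying the running depth.
    depth, coverage, prev = 0, {}, None
    for pos in sorted(events):
        if depth > 0:
            for p in range(prev, pos):
                coverage[p] = depth
        depth += events[pos]
        prev = pos
    return coverage


reads = [Interval("chr1", 1, 4), Interval("chr1", 3, 6), Interval("chr2", 1, 2)]
targets = [Interval("chr1", 5, 10)]

pairs = interval_join(reads, targets)
coverage = depth_of_coverage(reads, "chr1")
```

Here only the read spanning 3 to 6 on `chr1` overlaps the target, and positions 3 and 4 are covered by two reads. A distributed engine would need to avoid the all-pairs comparison used in this sketch, which is the kind of cost the efficient implementations mentioned above are meant to address.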
It is big-data ready and cloud-native.
Contributions welcome!
We use a pull request workflow on GitHub for contributions. New users are always welcome!