A new efficient referential genome compression technique for FastQ files

Bhukya R et al (2020) Compression for DNA sequences using Huffman encoding. In: Information and Communication Technology for Sustainable Development. Springer, Singapore, pp 615–624

Bonfield JK, Mahoney MV (2013) Compression of FASTQ and SAM format sequencing data. PLoS One 8(3):e59190

Chandak S et al (2018) SPRING: a next-generation compressor for FASTQ data. Bioinformatics

Deorowicz S, Grabowski S (2011) Compression of DNA sequence reads in FASTQ format. Bioinformatics 27(6):860–862

Dutta A, Haque MM, Bose T, Reddy CV, Mande SS (2015) FQC: a novel approach for efficient compression, archival, and dissemination of FastQ datasets. J Bioinform Comput Biol 13(3):1541003

Genome is digital, and can be compressed (2022) Available at: https://blog.chiariglione.org/genome-is-digital-and-can-be-compressed/. Accessed 21 May 2022

Guerra A et al (2019) Tackling the challenges of FASTQ referential compression. Bioinform Biol Insights 13:1177932218821373

Huang ZA, Wen Z, Deng Q, Chu Y, Sun Y, Zhu Z (2017) LW-FQZip 2: a parallelized reference-based compression of FASTQ files. BMC Bioinform 18(1):179

Jian DD et al (2020) Genome compression and decompression. U.S. Patent No. 10,679,727

Jones DC, Ruzzo WL, Peng X, Katze MG (2012) Compression of next-generation sequencing reads aided by highly efficient de novo assembly. Nucleic Acids Res 40(22):e171

Kowalski TM, Grabowski S (2020) PgRC: pseudogenome-based read compressor. Bioinformatics 36(7):2082–2089

Kredens KV et al (2020) Vertical lossless genomic data compression tools for assembled genomes: a systematic literature review. PLoS One 15(5):e0232942

Kryukov K et al (2020) Sequence Compression Benchmark (SCB) database: a comprehensive evaluation of reference-free compressors for FASTA-formatted sequences. GigaScience 9(7):giaa072

https://www.ncbi.nlm.nih.gov/sra. Accessed Jun 2022

Kumar S, Agarwal S (2018) WBFQC: a new approach for compressing next-generation sequencing data splitting into homogeneous streams. J Bioinform Comput Biol 1850018

Kumar S, Agarwal S, Prasad R (2015) Efficient read alignment using Burrows-Wheeler transform and wavelet tree. In: 2015 Second International Conference on Advances in Computing and Communication Engineering (ICACCE). IEEE, pp 133–138

Lee SJ, Cho GY, Ikeno F, Lee TR (2018) BAQALC: blockchain applied lossless efficient transmission of DNA sequencing data for next generation medical informatics. Appl Sci 8(9):1471

Liu Y, Peng H, Wong L, Li J (2017) High-speed and high-ratio referential genome compression. Bioinformatics 33(21):3364–3372

Mansouri D, Yuan X, Saidani A (2020) A new lossless DNA compression algorithm based on a single-block encoding scheme. Algorithms 13(4):99

Nicolae M, Pathak S, Rajasekaran S (2015) LFQC: a lossless compression algorithm for FASTQ files. Bioinformatics 31(20):3276–3281

Rabbani L, Müller J, Weigel D (2020) An algorithm to build a multi-genome reference. bioRxiv

Roguski Ł, Deorowicz S (2014) DSRC 2: industry-oriented compression of FASTQ files. Bioinformatics 30(15):2213–2215

Shokrof M, Abouelhoda M (2020) IonCRAM: a reference-based compression tool for ion torrent sequence files. BMC Bioinform 21(1):1–16

Sultan AY, Huang C-H (2019) LFastqC: a lossless non-reference-based FASTQ compressor. PLoS One 14(11)

Tembe W, Lowey J, Suh E (2010) G-SQZ: compact encoding of genomic sequence and quality data. Bioinformatics 26(17):2192–2194

Wan R, Anh VN, Asai K (2011) Transformations for the compression of FASTQ quality scores of next generation sequencing data. Bioinformatics 28(5):628–635

Wandelt S, Bux M, Leser U (2014) Trends in genome compression. Curr Bioinform 9(3)

Yu R, Yang W (2020) ScaleQC: a scalable lossy to lossless solution for NGS data compression. Bioinformatics

Zhang Y, Li L, Yang Y, Yang X, He S, Zhu Z (2015) Light-weight reference-based compression of FASTQ data. BMC Bioinform 16(1):188
