marinetics blog


Bioinformatics Solutions in Ecology and Evolution

02 Apr 2015

Count bases in Fasta file

Tagged: PERL NGS Fasta







The script in this blog post counts the total number of bases or amino acids across all sequences in a fasta file. This can be useful, for example, to determine the size of a genome or an assembly.

Download

You can download the script CountBasesInFastaFile.pl from https://gist.github.com/alj1983/9b3bfe7a8b7883aabb6e/download

How to use it?

If the Perl script and your fasta file are both located in the same folder, you just need to type:

perl CountBasesInFastaFile.pl Fastafile.fasta

Replace Fastafile.fasta with the name of your own fasta file.

The output will look similar to:

Number of bases or amino acids in the fasta file: 422
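
To sanity-check the reported number, you can compare it against a count made with standard Unix tools. The following is only a minimal sketch, not part of the script: example.fasta is a hypothetical file name, and the two short sequences are made up for illustration. It drops the header lines, removes the line breaks, and counts what is left.

```shell
# Create a small test file (example.fasta is a hypothetical name;
# the two sequences are invented for illustration).
printf '>seq1\nACGTACGT\n>seq2\nACGT\n' > example.fasta

# Drop header lines (starting with '>'), remove line breaks,
# and count the remaining characters.
count=$(grep -v '^>' example.fasta | tr -d '\r\n' | wc -c)

# $((count)) strips the padding that some wc implementations add.
echo "Number of bases or amino acids in the fasta file: $((count))"
```

The test file holds 8 + 4 = 12 bases, so both this pipeline and the Perl script should report 12 for it.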

Code

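#!/usr/bin/perl

use strict;
use warnings;

# Script to count the number of bases or amino acids in a fasta file
# Usage: perl CountBasesInFastaFile.pl Fastafile.fasta
unless (@ARGV) { die "A fasta input file is needed\n" }

# Three-argument open with a lexical filehandle; $! reports why opening failed
open(my $in, '<', $ARGV[0]) or die "Can not open file $ARGV[0]: $!\n";

my $seq = '';    # start with an empty string so an empty file gives 0
while (my $line = <$in>) {
    next if $line =~ /^>/;    # skip header lines, which start with '>'
    $line =~ s/\s+//g;        # remove line endings (including \r) and other whitespace
    $seq .= $line;
}

close $in;

my $seqlength = length($seq);

print "Number of bases or amino acids in the fasta file: $seqlength\n";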