| Class | Bio::Blast::Fastacmd |
| In: | lib/bio/io/fastacmd.rb |
| Parent: | Object |
Retrieves FASTA-formatted sequences from a BLAST database using the NCBI fastacmd command.
This class requires the 'fastacmd' command and a BLAST database formatted with the '-o' option of 'formatdb'.
require 'bio'
fastacmd = Bio::Blast::Fastacmd.new("/db/myblastdb")
entry = fastacmd.get_by_id("sp:128U_DROME")
fastacmd.fetch("sp:128U_DROME")
fastacmd.fetch(["sp:1433_SPIOL", "sp:1432_MAIZE"])
fastacmd.fetch(["sp:1433_SPIOL", "sp:1432_MAIZE"]).each do |fasta|
puts fasta
end
| database | [RW] | Database file path. |
| fastacmd | [RW] | fastacmd command file path. |
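Since the class depends on an external fastacmd executable, it can help to check for it before issuing queries. The following is a minimal sketch in plain Ruby; `command_on_path?` is a hypothetical helper, not part of BioRuby, and it assumes a Unix-like system where the executable has no file extension.

```ruby
# Hypothetical helper (not part of BioRuby): returns true when an executable
# file named +name+ exists in one of the directories listed in +path+.
def command_on_path?(name, path = ENV.fetch("PATH", ""))
  path.split(File::PATH_SEPARATOR).any? do |dir|
    next false if dir.empty?
    candidate = File.join(dir, name)
    File.file?(candidate) && File.executable?(candidate)
  end
end

unless command_on_path?("fastacmd")
  warn "fastacmd not found on PATH; set the fastacmd attribute to its full path"
end
```

If the check fails, the writable fastacmd attribute listed above can be set to the full path of the executable.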
This method provides a handle to a BLASTable database, which you can then use to retrieve sequences.
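Prerequisites: the NCBI fastacmd command is installed, and the BLASTable database was created with the '-o T' option of formatdb.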
For example, suppose the original input file looks like:
>my_seq_1
ACCGACCTCCGGAACGGATAGCCCGACCTACG
>my_seq_2
TCCGACCTTTCCTACCGCACACCTACGCCATCAC
...
and you've created a BLASTable database from that with the commands
cd /my_dir/
formatdb -i my_input_file -t Test -n Test -o T
then you can get a handle to this database with the command
fastacmd = Bio::Blast::Fastacmd.new("/my_dir/Test")
Arguments: blast_database_file_path, the path and name of the BLASTable database.
# File lib/bio/io/fastacmd.rb, line 81
def initialize(blast_database_file_path)
  @database = blast_database_file_path
  @fastacmd = 'fastacmd'
end
Iterates over all sequences in the database.
fastacmd.each_entry do |fasta|
  p [ fasta.definition[0..30], fasta.seq.size ]
end
| Returns: | a Bio::FastaFormat object for each iteration |
# File lib/bio/io/fastacmd.rb, line 130
def each_entry
  cmd = [ @fastacmd, '-d', @database, '-D', '1' ]
  Bio::Command.call_command(cmd) do |io|
    io.close_write
    Bio::FlatFile.open(Bio::FastaFormat, io) do |f|
      f.each_entry do |entry|
        yield entry
      end
    end
  end
  self
end
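To illustrate the entry-by-entry iteration without a BLAST installation, here is a minimal sketch in plain Ruby. `each_fasta_record` is a hypothetical stand-in for the parsing that Bio::FlatFile performs in the method above, and the text it reads is the sample input file from the constructor example, not actual fastacmd output.

```ruby
require "stringio"

# Hypothetical stand-in for Bio::FlatFile: yields [definition, sequence]
# pairs from FASTA text read from +io+, one record at a time.
def each_fasta_record(io)
  return enum_for(:each_fasta_record, io) unless block_given?

  definition = nil
  sequence = +""
  io.each_line do |line|
    line = line.chomp
    next if line.empty?

    if line.start_with?(">")
      # A new header closes the previous record, if there was one.
      yield definition, sequence if definition
      definition = line[1..-1]
      sequence = +""
    else
      sequence << line
    end
  end
  # Emit the final record once the input is exhausted.
  yield definition, sequence if definition
end

fasta_text = <<~FASTA
  >my_seq_1
  ACCGACCTCCGGAACGGATAGCCCGACCTACG
  >my_seq_2
  TCCGACCTTTCCTACCGCACACCTACGCCATCAC
FASTA

each_fasta_record(StringIO.new(fasta_text)) do |definition, sequence|
  p [definition[0..30], sequence.size]
end
```

As with each_entry, records are handed to the block one at a time, so the whole database never needs to be held in memory.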
Gets the sequences for a list of IDs in the database.
For example:
p fastacmd.fetch(["sp:1433_SPIOL", "sp:1432_MAIZE"])
This method always returns an array of Bio::FastaFormat objects, even when the result is a single entry.
Arguments: list, the IDs to retrieve (an array of strings, or a single ID string).
| Returns: | array of Bio::FastaFormat objects |
# File lib/bio/io/fastacmd.rb, line 109
def fetch(list)
  if list.respond_to?(:join)
    entry_id = list.join(",")
  else
    entry_id = list
  end

  cmd = [ @fastacmd, '-d', @database, '-s', entry_id ]
  Bio::Command.call_command(cmd) do |io|
    io.close_write
    Bio::FlatFile.new(Bio::FastaFormat, io).to_a
  end
end
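The argument handling in the source above accepts either a single ID or a list of IDs. The sketch below reproduces just that step in plain Ruby so it can be run without fastacmd; `fastacmd_fetch_args` is a hypothetical helper, not part of BioRuby, and it only builds the command array rather than executing it.

```ruby
# Hypothetical helper mirroring the argument handling in #fetch: anything that
# responds to #join (such as an Array) is collapsed into one comma-separated
# string; any other value is passed through unchanged.
def fastacmd_fetch_args(database, ids, fastacmd = "fastacmd")
  entry_id = ids.respond_to?(:join) ? ids.join(",") : ids
  [fastacmd, "-d", database, "-s", entry_id]
end

single = fastacmd_fetch_args("/my_dir/Test", "sp:128U_DROME")
multi  = fastacmd_fetch_args("/my_dir/Test", ["sp:1433_SPIOL", "sp:1432_MAIZE"])

p single
p multi
```

Because the IDs are joined into one argument, a list of IDs results in a single fastacmd invocation rather than one per ID.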
Get the sequence of a specific entry in the BLASTable database. For example:
entry = fastacmd.get_by_id("sp:128U_DROME")
Arguments: entry_id, the ID of an entry in the BLAST database.
| Returns: | a Bio::FastaFormat object |
# File lib/bio/io/fastacmd.rb, line 94
def get_by_id(entry_id)
  fetch(entry_id).shift
end
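Because get_by_id takes the first element of the array returned by fetch, it evaluates to nil whenever that array is empty. The sketch below shows a nil check using a hypothetical `StubFastacmd` class in place of Bio::Blast::Fastacmd, so it runs without a BLAST database. It assumes that fetch returns an empty array for an unknown ID, which is not confirmed here.

```ruby
# Hypothetical stub standing in for Bio::Blast::Fastacmd. It assumes that
# fetch returns [] when no entry matches the requested ID.
class StubFastacmd
  def initialize(entries)
    @entries = entries # Hash of ID => FASTA text
  end

  def fetch(list)
    ids = list.respond_to?(:join) ? list : [list]
    ids.map { |id| @entries[id] }.compact
  end

  # Same body as get_by_id in lib/bio/io/fastacmd.rb.
  def get_by_id(entry_id)
    fetch(entry_id).shift
  end
end

stub = StubFastacmd.new("sp:128U_DROME" => ">sp:128U_DROME\nACGT")

["sp:128U_DROME", "sp:UNKNOWN"].each do |id|
  entry = stub.get_by_id(id)
  if entry.nil?
    puts "#{id}: not found"
  else
    puts "#{id}: found"
  end
end
```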