| Class | Bio::NBRF |
| In: | lib/bio/db/nbrf.rb |
| Parent: | DB |
| DELIMITER | = | RS = "\n>" | Delimiter of each entry. Bio::FlatFile uses it. |
| DELIMITER_OVERRUN | = | 1 | (Integer) excess read size included in DELIMITER. |
| entry_id | -> | accession |
| data | [RW] | sequence data of the entry (???) |
| definition | [RW] | Returns the description line of the NBRF/PIR formatted data. |
| entry_id | [RW] | Returns ID described in the entry. |
| entry_overrun | [R] | piece of next entry. Bio::FlatFile uses it. |
| seq_type | [RW] | Returns sequence type described in the entry. P1 (protein), F1 (protein fragment), DL (DNA linear), DC (DNA circular), RL (RNA linear), RC (RNA circular), N3 (tRNA), N1 (other functional RNA) |
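The role of DELIMITER and DELIMITER_OVERRUN can be illustrated with a small sketch. It assumes a reader that fetches each entry with `IO#gets(DELIMITER)`, so every chunk except the last ends with `"\n>"`, and the trailing `>` (one character, the overrun) really belongs to the next entry. The loop and the sample data are hypothetical and are not Bio::FlatFile's actual implementation.

```ruby
require 'stringio'

DELIMITER = "\n>"
DELIMITER_OVERRUN = 1   # the '>' read along with the delimiter

io = StringIO.new(">P1;A\ndef A\nMKV*\n>P1;B\ndef B\nGGA*\n")
entries = []
carry = ''              # overrun carried over from the previous read
while (chunk = io.gets(DELIMITER))
  chunk = carry + chunk
  if chunk.end_with?(DELIMITER)
    # hand the overrun back to the next entry
    carry = chunk[-DELIMITER_OVERRUN, DELIMITER_OVERRUN]
    chunk = chunk[0...-DELIMITER_OVERRUN]
  else
    carry = ''
  end
  entries << chunk
end

entries.each { |e| puts e.inspect }
```

Each element of `entries` then starts with `>` and can be passed to the constructor on its own.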
Creates a new NBRF object. It stores the comment and sequence information from one entry of the NBRF/PIR format string. If the argument contains more than one entry, only the first entry is used.
# File lib/bio/db/nbrf.rb, line 45
def initialize(str)
  str = str.sub(/\A[\r\n]+/, '') # remove first void lines
  line1, line2, rest = str.split(/^/, 3)

  rest = rest.to_s
  rest.sub!(/^>.*/m, '') # remove trailing entries for sure
  @entry_overrun = $&
  rest.sub!(/\*\s*\z/, '') # remove last '*' and "\n"
  @data = rest

  @definition = line2.to_s.chomp
  if /^>?([A-Za-z0-9]{2})\;(.*)/ =~ line1.to_s then
    @seq_type = $1
    @entry_id = $2
  end
end
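The parsing steps above can be traced on a small entry. The following is a minimal sketch in plain Ruby that mirrors the constructor's logic without requiring BioRuby; the `parse_nbrf` helper and the sample text are hypothetical and exist only for illustration.

```ruby
# Hypothetical helper that repeats the constructor's steps and
# returns the parsed fields as a Hash.
def parse_nbrf(str)
  str = str.sub(/\A[\r\n]+/, '')          # drop leading blank lines
  line1, line2, rest = str.split(/^/, 3)  # header, definition, remainder
  rest = rest.to_s.dup
  rest.sub!(/^>.*/m, '')                  # cut off any following entries
  overrun = $&                            # the text just removed, or nil
  rest.sub!(/\*\s*\z/, '')                # drop the terminating '*'
  result = { definition: line2.to_s.chomp, data: rest, entry_overrun: overrun }
  if /^>?([A-Za-z0-9]{2});(.*)/ =~ line1.to_s
    result[:seq_type] = $1
    result[:entry_id] = $2
  end
  result
end

text = "\n>P1;CRAB_ANAPL\nALPHA CRYSTALLIN B CHAIN\n" \
       "MDITIHNPLI RRPLFSWLAP\nSRIFDQIFGE*\n>P1;NEXT\nnext entry\nAAA*\n"
entry = parse_nbrf(text)
puts entry[:seq_type]    # P1
puts entry[:entry_id]    # CRAB_ANAPL
puts entry[:definition]  # ALPHA CRYSTALLIN B CHAIN
```

The sequence lines are kept as raw text at this stage; whitespace and digits are only stripped later, when the sequence is requested.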
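Creates NBRF/PIR formatted text from a hash with the keys :seq_type, :entry_id, :definition, :seq and :width. Any key can be omitted: a missing :seq_type is inferred from the class of :seq (falling back to 'XX'), and :width defaults to 70. Passing nil or false as :width disables line wrapping.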
# File lib/bio/db/nbrf.rb, line 167
def self.to_nbrf(hash)
  seq_type = hash[:seq_type]
  seq = hash[:seq]
  unless seq_type
    if seq.is_a?(Bio::Sequence::AA) then
      seq_type = 'P1'
    elsif seq.is_a?(Bio::Sequence::NA) then
      seq_type = /u/i =~ seq ? 'RL' : 'DL'
    else
      seq_type = 'XX'
    end
  end
  width = hash.has_key?(:width) ? hash[:width] : 70
  if width then
    seq = seq.to_s + "*"
    seq.gsub!(Regexp.new(".{1,#{width}}"), "\\0\n")
  else
    seq = seq.to_s + "*\n"
  end
  ">#{seq_type};#{hash[:entry_id]}\n#{hash[:definition]}\n#{seq}"
end
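The wrapping behaviour of the listing above is easiest to see on a short sequence. This sketch reproduces only the formatting step with plain strings, so it runs without BioRuby; `format_nbrf` and its keyword arguments are hypothetical stand-ins, not part of the library.

```ruby
# Hypothetical stand-in for the formatting step, using plain strings.
def format_nbrf(seq_type:, entry_id:, definition:, seq:, width: 70)
  body = seq.to_s + '*'                          # '*' terminates the sequence
  if width
    body = body.gsub(/.{1,#{width}}/, "\\0\n")   # wrap, ending every line with "\n"
  else
    body += "\n"
  end
  ">#{seq_type};#{entry_id}\n#{definition}\n#{body}"
end

puts format_nbrf(seq_type: 'P1', entry_id: 'TEST1',
                 definition: 'example peptide',
                 seq: 'MKVLAAGIVGLLLAQ', width: 10)
# >P1;TEST1
# example peptide
# MKVLAAGIVG
# LLLAQ*
```

Because the `*` is appended before wrapping, it counts toward the line width.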
Returns the protein (amino acid) sequence. Calling aaseq on a nucleic acid sequence raises a RuntimeError. Use this method when you already know whether the sequence is NA or AA.
# File lib/bio/db/nbrf.rb, line 143
def aaseq
  if seq.is_a?(Bio::Sequence::NA) then
    # nucleic acid data cannot be returned as protein
    # (message text as written in the source file)
    raise 'not nucleic but protein sequence'
  elsif seq.is_a?(Bio::Sequence::AA) then
    seq
  else
    Bio::Sequence::AA.new(seq)
  end
end
Returns the nucleic acid sequence. Calling naseq on a protein sequence raises a RuntimeError. Use this method when you already know whether the sequence is NA or AA.
# File lib/bio/db/nbrf.rb, line 122
def naseq
  if seq.is_a?(Bio::Sequence::AA) then
    raise 'not nucleic but protein sequence'
  elsif seq.is_a?(Bio::Sequence::NA) then
    seq
  else
    Bio::Sequence::NA.new(seq)
  end
end
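The three branches of the guard in naseq (reject, pass through, convert) can be exercised with a small sketch. `AASeq` and `NASeq` are hypothetical stand-ins for Bio::Sequence::AA and Bio::Sequence::NA, modelled here as String subclasses, which is an assumption of this sketch; `as_na` mirrors naseq but takes the sequence as an argument.

```ruby
# Hypothetical stand-ins for the two typed sequence classes.
class AASeq < String; end
class NASeq < String; end

# Mirrors the branching of naseq on an explicit argument.
def as_na(seq)
  if seq.is_a?(AASeq)
    raise 'not nucleic but protein sequence'   # plain raise => RuntimeError
  elsif seq.is_a?(NASeq)
    seq                                        # already typed: return as is
  else
    NASeq.new(seq)                             # untyped data: convert
  end
end

typed = NASeq.new('atgc')
puts as_na(typed).equal?(typed)   # true, same object returned
puts as_na('atgc').class          # NASeq
begin
  as_na(AASeq.new('MKV'))
rescue RuntimeError => e
  puts "rejected: #{e.message}"
end
```

Callers that do not know the sequence type in advance would need to rescue the error or inspect the type before choosing between the two accessors.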
Returns sequence data. Returns Bio::Sequence::NA, Bio::Sequence::AA or Bio::Sequence, according to the sequence type.
# File lib/bio/db/nbrf.rb, line 107
def seq
  unless defined?(@seq)
    @seq = seq_class.new(@data.tr(" \t\r\n0-9", '')) # lazy clean up
  end
  @seq
end
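The lazy clean-up above combines two ideas: `String#tr` with an empty replacement deletes the listed characters, and `defined?(@seq)` ensures the work happens only once. A minimal sketch follows, using a hypothetical `LazyEntry` class and plain strings in place of the Bio::Sequence classes.

```ruby
# Hypothetical class showing the memoised clean-up in isolation.
class LazyEntry
  attr_reader :cleanups   # counts how often the clean-up actually ran

  def initialize(data)
    @data = data
    @cleanups = 0
  end

  def seq
    unless defined?(@seq)
      @cleanups += 1
      # delete spaces, tabs, line breaks and digits (e.g. position numbers)
      @seq = @data.tr(" \t\r\n0-9", '')
    end
    @seq
  end
end

entry = LazyEntry.new("MDITIHNPLI RRPLFSWLAP 20\n\tSRIFDQIFGE 30\r\n")
puts entry.seq        # MDITIHNPLIRRPLFSWLAPSRIFDQIFGE
entry.seq
puts entry.cleanups   # 1
```

One consequence of the caching, if `data=` is a plain attribute writer as the [RW] marker suggests, is that assigning new data after the first call to seq would not change the returned sequence.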
Returns Bio::Sequence::AA, Bio::Sequence::NA, or Bio::Sequence, depending on sequence type.
# File lib/bio/db/nbrf.rb, line 91
def seq_class
  case @seq_type
  when /[PF]1/
    # protein
    Sequence::AA
  when /[DR][LC]/, /N[13]/
    # nucleic
    Sequence::NA
  else
    Sequence
  end
end
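The dispatch above can be checked against every sequence type code listed for seq_type. This sketch uses the same patterns but returns symbols instead of classes so that it runs without BioRuby; `seq_kind` is a hypothetical name.

```ruby
# Hypothetical helper: same patterns as seq_class, symbols instead of classes.
def seq_kind(seq_type)
  case seq_type
  when /[PF]1/             then :protein
  when /[DR][LC]/, /N[13]/ then :nucleic
  else                          :generic
  end
end

%w[P1 F1 DL DC RL RC N3 N1 XX].each do |code|
  puts "#{code} -> #{seq_kind(code)}"
end
```

An unrecognised code such as 'XX', or a missing seq_type (nil), falls through to the generic branch, which corresponds to plain Bio::Sequence in the listing.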