| Module | Bio::Alignment::SiteMethods |
| In: |
lib/bio/alignment.rb
|
Bio::Alignment::SiteMethods is a set of methods for Bio::Alignment::Site. It can also be used for extending an array of single-letter strings.
| IUPAC_NUC | = | [ %w( t u ), %w( m a c ), %w( r a g ), %w( w a t u ), %w( s c g ), %w( y c t u ), %w( k g t u ), %w( v a c g m r s ), %w( h a c t u m w y ), %w( d a g t u r w k ), %w( b c g t u s y k ), %w( n a c g t u m r w s y k v h d b ) ] | IUPAC nucleotide groups. Internal use only. | |
| StrongConservationGroups | = | %w(STA NEQK NHQK NDEQ QHRK MILV MILF HY FYW).collect { |x| x.split('').sort } |
Table of strongly conserved amino-acid groups.
The values in the table are taken from BioPerl (Bio/SimpleAlign.pm in BioPerl 1.0). The BioPerl documentation says that they are taken from the Clustalw documentation: "These are all the positively scoring groups that occur in the Gonnet Pam250 matrix. The strong and weak groups are defined as strong score >0.5 and weak score =<0.5 respectively." |
|
| WeakConservationGroups | = | %w(CSA ATV SAG STNK STPA SGND SNDEQK NDEQHK NEQHRK FVLIM HFY).collect { |x| x.split('').sort } |
Table of weakly conserved amino-acid groups.
Please refer to the StrongConservationGroups documentation for the origin of the table. |
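The two tables above are lists of sorted residue arrays, and a site counts as conserved when all of its residues fall inside a single group. The following is a minimal sketch of that lookup, not the library code: STRONG_GROUPS, WEAK_GROUPS and conservation_class are hypothetical names used only for illustration, built the same way as the constants above.

```ruby
# Hypothetical local copies of the two tables, built as in the constants above.
STRONG_GROUPS = %w(STA NEQK NHQK NDEQ QHRK MILV MILF HY FYW).collect { |x| x.split('').sort }
WEAK_GROUPS = %w(CSA ATV SAG STNK STPA SGND SNDEQK NDEQHK NEQHRK FVLIM HFY).collect { |x| x.split('').sort }

# Hypothetical helper: classify a site (array of single-letter strings)
# as :strong, :weak or :none, depending on which table has a group
# that contains every residue of the site.
def conservation_class(site)
  a = site.collect { |c| c.upcase }.sort.uniq
  if STRONG_GROUPS.find { |g| (a - g).empty? }
    :strong
  elsif WEAK_GROUPS.find { |g| (a - g).empty? }
    :weak
  else
    :none
  end
end
```

With these definitions, conservation_class(%w(S T A)) gives :strong, conservation_class(%w(C S)) gives :weak, and conservation_class(%w(W G)) gives :none. The subset test relies on Array#-, which leaves an empty array when every residue of the site belongs to the group.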
Returns an IUPAC consensus base for the site. If a consensus is found, returns a single-letter string. If not, returns nil.
# File lib/bio/alignment.rb, line 218
def consensus_iupac
  a = self.collect { |x| x.downcase }.sort.uniq
  if a.size == 1 then
    case a[0]
    when 'a', 'c', 'g', 't'
      a[0]
    when 'u'
      't'
    else
      IUPAC_NUC.find { |x| a[0] == x[0] } ? a[0] : nil
    end
  elsif r = IUPAC_NUC.find { |x| (a - x).size <= 0 } then
    r[0]
  else
    nil
  end
end
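To illustrate how the IUPAC lookup resolves a mixed site, here is a standalone sketch that follows the same steps as the listing above without depending on the library. IUPAC_GROUPS and iupac_consensus are hypothetical names; the table is a local copy of IUPAC_NUC, where the first element of each group is the ambiguity code and the rest are the bases it covers.

```ruby
# Local copy of the IUPAC_NUC table from this page (hypothetical name).
IUPAC_GROUPS = [
  %w( t u ), %w( m a c ), %w( r a g ), %w( w a t u ), %w( s c g ),
  %w( y c t u ), %w( k g t u ), %w( v a c g m r s ),
  %w( h a c t u m w y ), %w( d a g t u r w k ),
  %w( b c g t u s y k ), %w( n a c g t u m r w s y k v h d b )
]

# Hypothetical standalone function following the steps of consensus_iupac.
def iupac_consensus(site)
  a = site.collect { |x| x.downcase }.sort.uniq
  if a.size == 1
    case a[0]
    when 'a', 'c', 'g', 't'
      a[0]
    when 'u'
      't' # uracil is reported as 't'
    else
      # A lone ambiguity code is kept only if it heads one of the groups.
      IUPAC_GROUPS.find { |x| a[0] == x[0] } ? a[0] : nil
    end
  elsif (r = IUPAC_GROUPS.find { |x| (a - x).empty? })
    r[0] # first (smallest) group that covers every character of the site
  end
end
```

For example, iupac_consensus(%w(A g)) gives "r", iupac_consensus(%w(c t)) gives "y", and iupac_consensus(%w(a c g t)) gives "n". A site containing a character outside the table, such as %w(a -), yields nil. Because the groups are ordered from narrow to broad, the first match is the most specific code.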
Returns the consensus character of the site. If a consensus is found, returns a single-letter string. If not, returns nil.
# File lib/bio/alignment.rb, line 181
def consensus_string(threshold = 1.0)
  return nil if self.size <= 0
  return self[0] if self.sort.uniq.size == 1
  h = Hash.new(0)
  self.each { |x| h[x] += 1 }
  total = self.size
  b = h.to_a.sort do |x,y|
    z = (y[1] <=> x[1])
    z = (self.index(x[0]) <=> self.index(y[0])) if z == 0
    z
  end
  if total * threshold <= b[0][1] then
    b[0][0]
  else
    nil
  end
end
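The effect of the threshold argument is easiest to see on a small site. The sketch below restates the logic of the listing above as a standalone function; consensus_char is a hypothetical name, and the code is an illustration rather than the library implementation.

```ruby
# Hypothetical standalone function following the steps of consensus_string.
def consensus_char(site, threshold = 1.0)
  return nil if site.empty?
  return site[0] if site.uniq.size == 1
  counts = Hash.new(0)
  site.each { |x| counts[x] += 1 }
  # Most frequent character first; ties go to the character that
  # appears earliest in the site.
  best, n = counts.to_a.sort { |x, y|
    z = (y[1] <=> x[1])
    z = (site.index(x[0]) <=> site.index(y[0])) if z == 0
    z
  }.first
  site.size * threshold <= n ? best : nil
end
```

With the default threshold of 1.0, consensus_char(%w(a a a g)) gives nil because only three of four characters agree, while consensus_char(%w(a a a g), 0.75) gives "a". The comparison is case-sensitive, so %w(A a) has no consensus at the default threshold.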
If there are gaps, returns true. Otherwise, returns false.
# File lib/bio/alignment.rb, line 164
def has_gap?
  (find { |x| is_gap?(x) }) ? true : false
end
Returns the match-line character for the site. This is the amino-acid version.
# File lib/bio/alignment.rb, line 258
def match_line_amino(opt = {})
  # opt[:match_line_char]   ==> 100% equal    default: '*'
  # opt[:strong_match_char] ==> strong match  default: ':'
  # opt[:weak_match_char]   ==> weak match    default: '.'
  # opt[:mismatch_char]     ==> mismatch      default: ' '
  mlc = (opt[:match_line_char]   or '*')
  smc = (opt[:strong_match_char] or ':')
  wmc = (opt[:weak_match_char]   or '.')
  mmc = (opt[:mismatch_char]     or ' ')
  a = self.collect { |c| c.upcase }.sort.uniq
  a.extend(SiteMethods)
  if a.has_gap? then
    mmc
  elsif a.size == 1 then
    mlc
  elsif StrongConservationGroups.find { |x| (a - x).empty? } then
    smc
  elsif WeakConservationGroups.find { |x| (a - x).empty? } then
    wmc
  else
    mmc
  end
end
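The option handling in the listing above uses the pattern (opt[:key] or default), so a missing key, nil, or false all fall back to the default character. The following sketch isolates that behaviour; match_chars is a hypothetical helper that is not part of the module, and only the option keys are taken from the listing.

```ruby
# Hypothetical helper that resolves the four option keys the same way
# match_line_amino does, returning the characters in a hash.
def match_chars(opt = {})
  {
    match:    (opt[:match_line_char]   or '*'),
    strong:   (opt[:strong_match_char] or ':'),
    weak:     (opt[:weak_match_char]   or '.'),
    mismatch: (opt[:mismatch_char]     or ' ')
  }
end
```

Calling match_chars with no argument returns the defaults, and match_chars(:match_line_char => '|', :mismatch_char => nil) replaces only the full-match character, because the nil value for :mismatch_char falls back to a space.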
Returns the match-line character for the site. This is the nucleic-acid version.
# File lib/bio/alignment.rb, line 284
def match_line_nuc(opt = {})
  # opt[:match_line_char] ==> 100% equal  default: '*'
  # opt[:mismatch_char]   ==> mismatch    default: ' '
  mlc = (opt[:match_line_char] or '*')
  mmc = (opt[:mismatch_char]   or ' ')
  a = self.collect { |c| c.upcase }.sort.uniq
  a.extend(SiteMethods)
  if a.has_gap? then
    mmc
  elsif a.size == 1 then
    mlc
  else
    mmc
  end
end
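As an illustration of how the per-site rule produces a whole match line, the sketch below applies it to every column of a small nucleotide alignment. All three names (gap_char?, nuc_match_char, nuc_match_line) are hypothetical. In particular, is_gap? is defined outside this section, so the sketch assumes that only '-' counts as a gap, which may differ from the library's actual gap test.

```ruby
# Assumption: only '-' is treated as a gap here. The real is_gap? is
# defined elsewhere in the library and may accept other characters.
def gap_char?(c)
  c == '-'
end

# Hypothetical standalone version of the per-site rule in match_line_nuc:
# the match character only when the column is gap-free and identical
# after upcasing, otherwise the mismatch character.
def nuc_match_char(site, match = '*', mismatch = ' ')
  a = site.collect { |c| c.upcase }.sort.uniq
  return mismatch if a.any? { |c| gap_char?(c) }
  a.size == 1 ? match : mismatch
end

# Build a match line by applying the rule to each alignment column.
# All sequences must have the same length for transpose to work.
def nuc_match_line(seqs)
  columns = seqs.collect { |s| s.split('') }.transpose
  columns.collect { |site| nuc_match_char(site) }.join
end
```

For the three aligned sequences "ACGT", "acgA" and "AC-T", nuc_match_line returns "**  ": the first two columns match regardless of case, the third contains a gap, and the fourth has differing bases.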