| Class | Bio::GFF::GFF3::Record::Gap |
| In: | lib/bio/db/gff.rb |
| Parent: | Object |
Bio::GFF::GFF3::Record::Gap is a class that stores the data of a "Gap" attribute.
| Code | = | Struct.new(:code, :length) | Code is a class that stores a single-letter code together with its length. |
| data | [R] | Internal data. Users must not use it. |
Arguments:
# File lib/bio/db/gff.rb, line 1246
def initialize(str = nil)
  if str then
    @data = str.split(/ +/).collect do |x|
      if /\A([A-Z])([0-9]+)\z/ =~ x.strip then
        Code.new($1.intern, $2.to_i)
      else
        warn "ignored unknown token: #{x.inspect}" if $VERBOSE
        nil
      end
    end
    @data.compact!
  else
    @data = []
  end
end
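The tokenizing step in the constructor can be tried without loading BioRuby. The following is a minimal standalone sketch that mirrors the parsing logic in the listing above. `GapCode` and `parse_gap` are hypothetical names introduced here for illustration; they are not part of the library.

```ruby
# Hypothetical stand-in for Bio::GFF::GFF3::Record::Gap::Code.
GapCode = Struct.new(:code, :length)

# Splits a Gap attribute value such as "M8 D3 M6" into GapCode entries.
# Tokens that are not <one uppercase letter><digits> are skipped.
def parse_gap(str)
  return [] if str.nil?
  parsed = str.split(/ +/).map do |token|
    if /\A([A-Z])([0-9]+)\z/ =~ token.strip
      GapCode.new($1.intern, $2.to_i)
    end
  end
  parsed.compact
end

codes = parse_gap("M8 D3 M6 I1 M6")
p codes.map(&:to_a)
```

Each entry keeps the operation letter as a symbol and the run length as an integer, which is the shape the `Code` struct above describes.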
Creates a new Gap object from the given sequence alignment.
Note that sites where both the reference and the target are gaps are silently removed.
Arguments:
# File lib/bio/db/gff.rb, line 1362
def self.new_from_sequences_na(reference, target,
                               gap_regexp = /[^a-zA-Z]/)
  gap = self.new
  gap.instance_eval {
    __initialize_from_sequences_na(reference, target,
                                   gap_regexp)
  }
  gap
end
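To illustrate how an aligned pair of nucleotide sequences maps to Gap codes, here is a minimal standalone sketch. It is not the BioRuby implementation: `gap_codes_from_alignment` is a hypothetical helper, it assumes both inputs are already aligned and equally long, and it follows the GFF3 convention that `I` marks a gap in the reference and `D` a gap in the target. Columns where both sequences are gaps are dropped, as the note above describes.

```ruby
# Hypothetical helper: derives run-length gap codes from two pre-aligned,
# equally long nucleotide strings. Not the BioRuby implementation.
def gap_codes_from_alignment(reference, target, gap_regexp = /[^a-zA-Z]/)
  unless reference.length == target.length
    raise ArgumentError, "sequences differ in length"
  end
  runs = []
  reference.each_char.zip(target.each_char) do |r, t|
    ref_gap = gap_regexp.match?(r)
    tgt_gap = gap_regexp.match?(t)
    next if ref_gap && tgt_gap   # both are gaps: the column is dropped
    code = if ref_gap
             :I                  # gap in the reference
           elsif tgt_gap
             :D                  # gap in the target
           else
             :M                  # both sequences have a residue
           end
    if runs.last && runs.last[0] == code
      runs.last[1] += 1          # extend the current run
    else
      runs << [code, 1]          # start a new run
    end
  end
  runs.map { |code, len| "#{code}#{len}" }.join(" ")
end

puts gap_codes_from_alignment("atgc--atgc", "atgcggat-c")
```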
Creates a new Gap object from the given sequence alignment.
Note that sites where both the reference and the target are gaps are silently removed.
For incorrect alignments that break the 3:1 rule, gap positions will be moved inside codons, unwanted gaps will be removed, and forward or reverse frameshifts will be inserted.
For example,
  atgg-taagac-att
  M  V  K  -  I
is treated as:
  atggt<aagacatt
  M  V  K  >>I
An incorrect combination of a frameshift with another frameshift or with a gap may cause undefined behavior.
Forward frameshifts are recommended to be indicated in the target sequence. Reverse frameshifts can be indicated in either the reference sequence or the target sequence.
Priority of regular expressions:
space > forward/reverse frameshift > gap
Arguments:
# File lib/bio/db/gff.rb, line 1558
def self.new_from_sequences_na_aa(reference, target,
                                  gap_regexp = /[^a-zA-Z]/,
                                  space_regexp = /\s/,
                                  forward_frameshift_regexp = /\>/,
                                  reverse_frameshift_regexp = /\</)
  gap = self.new
  gap.instance_eval {
    __initialize_from_sequences_na_aa(reference, target,
                                      gap_regexp,
                                      space_regexp,
                                      forward_frameshift_regexp,
                                      reverse_frameshift_regexp)
  }
  gap
end
Returns true if self == other; otherwise, returns false.
# File lib/bio/db/gff.rb, line 1586
def ==(other)
  if other.class == self.class and
      @data == other.data then
    true
  else
    false
  end
end
Processes nucleotide sequences and returns gapped sequences as an array of sequences.
Note on forward/reverse frameshifts: a forward frameshift is simply treated as a gap inserted into the target sequence, and a reverse frameshift as a gap inserted into the reference sequence.
Arguments:
# File lib/bio/db/gff.rb, line 1686
def process_sequences_na(reference, target, gap_char = '-')
  s_ref, s_tgt = dup_seqs(reference, target)

  s_ref, s_tgt = __process_sequences(s_ref, s_tgt,
                                     gap_char, gap_char,
                                     1, 1,
                                     gap_char, gap_char)

  if $VERBOSE and s_ref.length != s_tgt.length then
    warn "returned sequences not equal length"
  end
  return s_ref, s_tgt
end
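The effect of applying gap codes to ungapped nucleotide sequences can be sketched without BioRuby. The helper below is hypothetical (`apply_gap_codes` is not a library method) and takes the codes as explicit `[symbol, length]` pairs rather than reading them from the object's internal data. It handles only `M`, `I`, and `D`, and assumes `I` inserts gaps into the reference and `D` inserts gaps into the target.

```ruby
# Hypothetical helper: inserts gap characters into ungapped sequences
# according to [symbol, length] pairs. Only :M, :I and :D are supported.
def apply_gap_codes(reference, target, codes, gap_char = "-")
  ref_out = String.new
  tgt_out = String.new
  ref_pos = 0
  tgt_pos = 0
  codes.each do |code, len|
    case code
    when :M                      # consume both sequences
      ref_out << reference[ref_pos, len]
      tgt_out << target[tgt_pos, len]
      ref_pos += len
      tgt_pos += len
    when :I                      # gap in the reference, consume the target
      ref_out << gap_char * len
      tgt_out << target[tgt_pos, len]
      tgt_pos += len
    when :D                      # consume the reference, gap in the target
      ref_out << reference[ref_pos, len]
      tgt_out << gap_char * len
      ref_pos += len
    else
      raise ArgumentError, "unsupported code: #{code.inspect}"
    end
  end
  [ref_out, tgt_out]
end

codes = [[:M, 4], [:I, 2], [:M, 2], [:D, 1], [:M, 1]]
ref, tgt = apply_gap_codes("atgcatgc", "atgcggatc", codes)
puts ref
puts tgt
```

Both returned strings have the same length when the codes are consistent with the input sequences, which is the condition the length warning in the listing above checks.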
Processes sequences and returns gapped sequences as an array of sequences. reference must be a nucleotide sequence, and target must be an amino acid sequence.
Note on reverse frameshifts: reverse frameshift characters are inserted in the reference sequence. For example, the alignment of "Gap=M3 R1 M2" is:
  atgaagat<aatgtc
  M  K  I  N  V
The alignment of "Gap=M3 R3 M3" is:
  atgaag<<<attaatgtc
  M  K  I  I  N  V
Arguments:
# File lib/bio/db/gff.rb, line 1723
def process_sequences_na_aa(reference, target,
                            gap_char = '-',
                            space_char = ' ',
                            forward_frameshift = '>',
                            reverse_frameshift = '<')
  s_ref, s_tgt = dup_seqs(reference, target)
  s_tgt = s_tgt.gsub(/./, "\\0#{space_char}#{space_char}")
  ref_increment = 3
  tgt_increment = 1 + space_char.length * 2
  ref_gap = gap_char * 3
  tgt_gap = "#{gap_char}#{space_char}#{space_char}"
  return __process_sequences(s_ref, s_tgt,
                             ref_gap, tgt_gap,
                             ref_increment, tgt_increment,
                             forward_frameshift,
                             reverse_frameshift)
end
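The first step of the listing above pads every amino acid to the width of one codon, so that the target lines up column by column with the nucleotide reference. A minimal standalone sketch of that padding step follows; `pad_amino_acids` is a hypothetical name, and the sequences are illustrative only.

```ruby
# Hypothetical helper mirroring the gsub call in the listing above:
# each amino acid is followed by two space characters, so one residue
# occupies three columns, the same width as a codon.
def pad_amino_acids(protein, space_char = " ")
  protein.gsub(/./, "\\0#{space_char}#{space_char}")
end

reference = "atgaagattaatgtc"
target    = pad_amino_acids("MKINV")

puts reference
puts target
```

With the default single-character `space_char`, each padded residue is three characters wide, which matches `tgt_increment = 1 + space_char.length * 2` and `ref_increment = 3` in the listing.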