Class: Oga::XML::PullParser
- Defined in:
- lib/oga/xml/pull_parser.rb
Overview
The PullParser class can be used to parse an XML document incrementally instead of parsing it as a whole. This results in lower memory usage and potentially faster parsing times. The downside is that pull parsers are typically more difficult to use compared to DOM parsers.
Basic parsing using this class works as follows:
    parser = Oga::XML::PullParser.new('... xml here ...')

    parser.parse do |node|
      # Yielded nodes are XML nodes such as Element, never the parser itself.
      if node.is_a?(Oga::XML::Element)
        # handle the element here
      end
    end
This parser yields proper XML node instances such as Element. Doctypes and XML declarations are ignored.
Constant Summary
- DISABLED_CALLBACKS =
[ :on_document, :on_doctype, :on_xml_decl, :on_element_children ]
- BLOCK_CALLBACKS =
[ :on_cdata, :on_comment, :on_text, :on_proc_ins ]
- NODE_SHORTHANDS =
Returns the shorthands that can be used for various node classes.
{ :text => XML::Text, :node => XML::Node, :cdata => XML::Cdata, :element => XML::Element, :doctype => XML::Doctype, :comment => XML::Comment, :xml_declaration => XML::XmlDeclaration }
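How the two callback lists are consumed is not shown on this page. One plausible reading, sketched below with stand-in classes rather than Oga's real parser, is that disabled callbacks are overridden as no-ops, while block callbacks build a node through the parent class and hand it to the block given to `parse`. `ToyBaseParser`, `ToyPullParser`, the event list, and the string "nodes" are all hypothetical.

```ruby
# Stand-in for the parent parser: each callback just builds a labelled value.
class ToyBaseParser
  def on_text(value)
    "text:#{value}"
  end

  def on_comment(value)
    "comment:#{value}"
  end

  def on_doctype(value)
    "doctype:#{value}"
  end
end

class ToyPullParser < ToyBaseParser
  DISABLED_CALLBACKS = [:on_doctype]
  BLOCK_CALLBACKS    = [:on_text, :on_comment]

  attr_reader :node

  # Disabled callbacks are overridden to do nothing and return nil.
  DISABLED_CALLBACKS.each do |name|
    define_method(name) { |*_args| nil }
  end

  # Block callbacks build the node through the parent, then yield it.
  BLOCK_CALLBACKS.each do |name|
    define_method(name) do |*args|
      @node = super(*args)
      @block.call(@node)
      nil
    end
  end

  # Stores the block, then replays a fixed list of [callback, value] events
  # (an assumption standing in for real tokenizing and parsing).
  def parse(events, &block)
    @block = block
    events.each { |callback, value| send(callback, value) }
    nil
  end
end

seen = []
events = [[:on_doctype, 'html'], [:on_text, 'hi'], [:on_comment, 'c']]
ToyPullParser.new.parse(events) { |node| seen << node }
```

In this model the doctype event never reaches the block, which matches the statement above that doctypes and XML declarations are ignored.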
Constants inherited from Parser
Oga::XML::Parser::CONFIG, Oga::XML::Parser::TOKEN_ERROR_MAPPING
Instance Attribute Summary
-
#nesting ⇒ Array
readonly
Array containing the names of the currently nested elements.
-
#node ⇒ Oga::XML::Node
readonly
Instance Method Summary
-
#after_element(*args) ⇒ Object
-
#initialize(*args) ⇒ PullParser
constructor
A new instance of PullParser.
-
#on(type, nesting = []) ⇒ Object
Calls the supplied block if the current node type and optionally the nesting match.
-
#on_element(*args) ⇒ Object
-
#parse {|| ... } ⇒ Object
Parses the input and yields every node to the supplied block.
Methods inherited from Parser
#_rule_0, #_rule_1, #_rule_10, #_rule_11, #_rule_12, #_rule_13, #_rule_14, #_rule_15, #_rule_16, #_rule_17, #_rule_18, #_rule_19, #_rule_2, #_rule_20, #_rule_21, #_rule_22, #_rule_23, #_rule_24, #_rule_25, #_rule_26, #_rule_27, #_rule_28, #_rule_29, #_rule_3, #_rule_30, #_rule_31, #_rule_32, #_rule_33, #_rule_34, #_rule_35, #_rule_36, #_rule_37, #_rule_38, #_rule_39, #_rule_4, #_rule_40, #_rule_41, #_rule_42, #_rule_5, #_rule_6, #_rule_7, #_rule_8, #_rule_9, #each_token, #on_attribute, #on_attributes, #on_cdata, #on_comment, #on_doctype, #on_document, #on_element_children, #on_proc_ins, #on_text, #on_xml_decl, #parser_error
Constructor Details
#initialize(*args) ⇒ PullParser
Returns a new instance of PullParser
    # File 'lib/oga/xml/pull_parser.rb', line 57
    def initialize(*args)
      super
      @nesting = []
    end
Instance Attribute Details
#nesting ⇒ Array (readonly)
Array containing the names of the currently nested elements.
    # File 'lib/oga/xml/pull_parser.rb', line 26
    def nesting
      @nesting
    end
#node ⇒ Oga::XML::Node (readonly)
    # File 'lib/oga/xml/pull_parser.rb', line 22
    def node
      @node
    end
Instance Method Details
#after_element(*args) ⇒ Object
    # File 'lib/oga/xml/pull_parser.rb', line 146
    def after_element(*args)
      nesting.pop

      return
    end
#on(type, nesting = []) ⇒ Object
Calls the supplied block if the current node type and optionally the nesting match. This method allows you to write this:
    parser.parse do |node|
      parser.on(:text, %w{people person name}) do
        puts node.text
      end
    end
Instead of this:
    parser.parse do |node|
      if node.is_a?(Oga::XML::Text) and parser.nesting == %w{people person name}
        puts node.text
      end
    end
When calling this method you can specify the following node types:
:cdata
:comment
:element
:text
    # File 'lib/oga/xml/pull_parser.rb', line 106
    def on(type, nesting = [])
      if node.is_a?(NODE_SHORTHANDS[type])
        if nesting.empty? or nesting == self.nesting
          yield
        end
      end
    end
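The matching rule in the source above can be tried in isolation. The sketch below reproduces the same two checks against stand-in classes, so it runs without Oga; `ToyText`, `ToyElement`, `TOY_SHORTHANDS`, and `toy_on` are hypothetical names, and the current nesting is passed in explicitly instead of being read from a parser.

```ruby
# Stand-in node classes; Oga's real classes live under Oga::XML.
ToyText    = Struct.new(:text)
ToyElement = Struct.new(:name)

TOY_SHORTHANDS = { :text => ToyText, :element => ToyElement }

# Mirrors the check in #on: the type must match, and the requested nesting
# must either be omitted (empty) or be exactly equal to the current nesting.
def toy_on(node, current_nesting, type, nesting = [])
  if node.is_a?(TOY_SHORTHANDS[type])
    yield if nesting.empty? || nesting == current_nesting
  end
end

node    = ToyText.new('Alice')
current = %w{people person name}
hits    = []

toy_on(node, current, :text)                         { hits << :any_depth }
toy_on(node, current, :text, %w{people person name}) { hits << :exact_path }
toy_on(node, current, :text, %w{person name})        { hits << :suffix_only }
toy_on(node, current, :element)                      { hits << :wrong_type }
```

Only the first two calls run their block. The nesting comparison is plain array equality, so a partial path such as `%w{person name}` does not match. Also note that a shorthand missing from the lookup hash yields `nil`, and `is_a?(nil)` raises a `TypeError` in Ruby.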
#on_element(*args) ⇒ Object
    # File 'lib/oga/xml/pull_parser.rb', line 135
    def on_element(*args)
      @node = super

      nesting << @node.name

      @block.call(@node)

      return
    end
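Together, `on_element` and `after_element` maintain `nesting` as a stack: a name is pushed when an element opens and popped when it closes. The following is a minimal model of that bookkeeping, not Oga's implementation; `ToyNestingTracker` and its `log` attribute are hypothetical, and the callbacks are invoked by hand in the order a parser would be expected to fire them.

```ruby
class ToyNestingTracker
  attr_reader :nesting, :log

  def initialize
    @nesting = []
    @log     = []
  end

  # Push first, then record a snapshot: this is what a block would observe,
  # since the real method pushes the name before calling the block.
  def on_element(name)
    nesting << name
    log << nesting.dup
    nil
  end

  # Closing an element removes the innermost name.
  def after_element
    nesting.pop
    nil
  end
end

tracker = ToyNestingTracker.new

# Events for: <people><person><name/></person><person/></people>
tracker.on_element('people')
tracker.on_element('person')
tracker.on_element('name')
tracker.after_element # closes name
tracker.after_element # closes first person
tracker.on_element('person')
tracker.after_element # closes second person
tracker.after_element # closes people
```

Because the push happens before the block is called, an element always sees its own name as the last entry of `nesting`, and the stack is empty again once the root element has closed.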
#parse {|| ... } ⇒ Object
Parses the input and yields every node to the supplied block.
    # File 'lib/oga/xml/pull_parser.rb', line 65
    def parse(&block)
      @block = block

      super

      return
    end