Class: Oga::XML::SaxParser
- Defined in:
- lib/oga/xml/sax_parser.rb
Overview
The SaxParser class provides the basic interface for writing custom SAX parsers. All callback methods defined in Parser are delegated to a dedicated handler class.
To write a custom handler for the SAX parser, create a class that implements one or more of the following callback methods:
on_document
on_doctype
on_cdata
on_comment
on_proc_ins
on_xml_decl
on_text
on_element
on_element_children
on_attribute
on_attributes
after_element
For example:
class SaxHandler
  def on_element(namespace, name, attrs = {})
    puts name
  end
end
You can then use it as follows:
handler = SaxHandler.new
parser = Oga::XML::SaxParser.new(handler, '<foo />')
parser.parse
For information on the callback arguments see the documentation of the corresponding methods in Parser.
Element Callbacks
The SAX parser changes the behaviour of both on_element and after_element. In the regular parser, after_element takes only an Element instance. In the SAX parser it instead takes a namespace name and the element name. This makes it easier to figure out which element a callback is associated with.
An example:
class SaxHandler
  def on_element(namespace, name, attrs = {})
    # ...
  end

  def after_element(namespace, name)
    puts name # => "foo", "bar", etc
  end
end
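Because after_element receives the same namespace and name as on_element, a handler can keep a stack of open elements without holding any Element objects. The following is a minimal sketch, assuming on_element fires when an element opens and after_element fires once it closes. PathHandler and its paths reader are hypothetical names, and the callbacks are invoked by hand here to stand in for the parser.

```ruby
# Hypothetical handler that records the path of every element it sees.
class PathHandler
  attr_reader :paths

  def initialize
    @stack = []
    @paths = []
  end

  # Called when an element opens: push its name and record the full path.
  def on_element(namespace, name, attrs = {})
    @stack << (namespace ? "#{namespace}:#{name}" : name)
    @paths << @stack.join('/')
  end

  # Called when an element closes: drop it from the stack.
  def after_element(namespace, name)
    @stack.pop
  end
end

# Simulate the callbacks expected for <root><a /><x:b /></root>.
handler = PathHandler.new
handler.on_element(nil, 'root')
handler.on_element(nil, 'a')
handler.after_element(nil, 'a')
handler.on_element('x', 'b')
handler.after_element('x', 'b')
handler.after_element(nil, 'root')

handler.paths # => ["root", "root/a", "root/x:b"]
```

With the real parser, the same handler would be passed to Oga::XML::SaxParser.new(handler, xml) and driven by calling parse.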
Attributes
Attributes returned by on_attribute are passed as a Hash as the 3rd argument of the on_element callback. The keys of this Hash are the attribute names (optionally prefixed by their namespace) and the values are the attribute values.
You can override on_attribute to control individual attributes and on_attributes to control the final set.
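As an illustration of the two hooks, the sketch below defines a handler that lowercases attribute names in on_attribute and drops empty values in on_attributes. It assumes, based on the default implementations documented further down, that on_attribute should return a single-pair Hash and that on_attributes receives an Array of those return values. NormalizingHandler is a hypothetical name, and the callbacks are called by hand in place of the parser.

```ruby
# Hypothetical handler that normalizes attributes before on_element sees them.
class NormalizingHandler
  attr_reader :elements

  def initialize
    @elements = []
  end

  # Controls a single attribute: build the key and lowercase it.
  # When a handler defines on_attribute, the parser returns its result as-is,
  # so any entity decoding of the value is left to the handler.
  def on_attribute(name, ns = nil, value = nil)
    key = ns ? "#{ns}:#{name}" : name
    { key.downcase => value }
  end

  # Controls the final set: merge the pairs and drop nil or empty values.
  def on_attributes(attrs)
    attrs.each_with_object({}) do |pair, merged|
      pair.each do |key, value|
        merged[key] = value unless value.nil? || value.empty?
      end
    end
  end

  def on_element(namespace, name, attrs = {})
    @elements << [name, attrs]
  end
end

# Simulate the calls for <foo ID="10" xml:Lang="en" title="" />.
handler = NormalizingHandler.new
pairs = [
  handler.on_attribute('ID', nil, '10'),
  handler.on_attribute('Lang', 'xml', 'en'),
  handler.on_attribute('title', nil, '')
]
handler.on_element(nil, 'foo', handler.on_attributes(pairs))

handler.elements # => [["foo", {"id" => "10", "xml:lang" => "en"}]]
```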
Direct Known Subclasses
Constant Summary
Constants inherited from Parser
Parser::CONFIG, Parser::TOKEN_ERROR_MAPPING
Instance Method Summary
- #after_element(namespace_with_name) ⇒ Object
  Manually define after_element so it can take a namespace and name.
- #initialize(handler, *args) ⇒ SaxParser (constructor)
  A new instance of SaxParser.
- #on_attribute(name, ns = nil, value = nil) ⇒ Object
  Manually define this method since for this one we do want the return value so it can be passed to on_element.
- #on_attributes(attrs) ⇒ Hash
  Merges the attributes together into a Hash.
- #on_element(namespace, name, attrs = []) ⇒ Array
  Manually define on_element so we can ensure that after_element always receives the namespace and name.
- #on_text(text) ⇒ Object
Methods inherited from Parser
#_rule_0, #_rule_1, #_rule_10, #_rule_11, #_rule_12, #_rule_13, #_rule_14, #_rule_15, #_rule_16, #_rule_17, #_rule_18, #_rule_19, #_rule_2, #_rule_20, #_rule_21, #_rule_22, #_rule_23, #_rule_24, #_rule_25, #_rule_26, #_rule_27, #_rule_28, #_rule_29, #_rule_3, #_rule_30, #_rule_31, #_rule_32, #_rule_33, #_rule_34, #_rule_35, #_rule_36, #_rule_37, #_rule_38, #_rule_39, #_rule_4, #_rule_40, #_rule_41, #_rule_42, #_rule_5, #_rule_6, #_rule_7, #_rule_8, #_rule_9, #each_token, #on_cdata, #on_comment, #on_doctype, #on_document, #on_element_children, #on_proc_ins, #on_xml_decl, #parser_error
Constructor Details
#initialize(handler, *args) ⇒ SaxParser
Returns a new instance of SaxParser.
# File 'lib/oga/xml/sax_parser.rb', line 71
def initialize(handler, *args)
  @handler = handler

  super(*args)
end
Instance Method Details
#after_element(namespace_with_name) ⇒ Object
Manually define after_element so it can take a namespace and name.
This differs a bit from the regular after_element, which only takes an Element instance.
# File 'lib/oga/xml/sax_parser.rb', line 93
def after_element(namespace_with_name)
  run_callback(:after_element, *namespace_with_name)

  return
end
#on_attribute(name, ns = nil, value = nil) ⇒ Object
Manually define this method since for this one we do want the return value so it can be passed to on_element.
# File 'lib/oga/xml/sax_parser.rb', line 103
def on_attribute(name, ns = nil, value = nil)
  if @handler.respond_to?(:on_attribute)
    return run_callback(:on_attribute, name, ns, value)
  end

  key = ns ? "#{ns}:#{name}" : name

  if value
    value = EntityDecoder.try_decode(value, @lexer.html?)
  end

  {key => value}
end
#on_attributes(attrs) ⇒ Hash
Merges the attributes together into a Hash.
# File 'lib/oga/xml/sax_parser.rb', line 121
def on_attributes(attrs)
  if @handler.respond_to?(:on_attributes)
    return run_callback(:on_attributes, attrs)
  end

  merged = {}

  attrs.each do |pair|
    # Hash#merge requires an extra allocation, this doesn't.
    pair.each { |key, value| merged[key] = value }
  end

  merged
end
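The default merge can be mirrored in plain Ruby to see how duplicate keys behave. This is a sketch, not the library's code path: merge_attribute_pairs is a hypothetical helper that copies the loop in the listing above, and it shows that a later pair with the same key replaces an earlier one.

```ruby
# Hypothetical stand-alone copy of the default merge loop.
def merge_attribute_pairs(attrs)
  merged = {}

  attrs.each do |pair|
    # Assign in place instead of calling Hash#merge for every pair.
    pair.each { |key, value| merged[key] = value }
  end

  merged
end

pairs = [{ 'class' => 'a' }, { 'xml:lang' => 'en' }, { 'class' => 'b' }]
merged = merge_attribute_pairs(pairs)

merged # => {"class" => "b", "xml:lang" => "en"}
```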
#on_element(namespace, name, attrs = []) ⇒ Array
Manually define on_element so we can ensure that after_element always receives the namespace and name.
# File 'lib/oga/xml/sax_parser.rb', line 82
def on_element(namespace, name, attrs = [])
  run_callback(:on_element, namespace, name, attrs)

  [namespace, name]
end
#on_text(text) ⇒ Object
# File 'lib/oga/xml/sax_parser.rb', line 137
def on_text(text)
  if @handler.respond_to?(:on_text)
    unless inside_literal_html?
      text = EntityDecoder.try_decode(text, @lexer.html?)
    end

    run_callback(:on_text, text)
  end

  return
end
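A handler that only cares about text needs to define nothing but on_text, since the listing above skips both the callback and the entity decoding when the handler does not respond to it. A minimal sketch, with TextCollector as a hypothetical name and the callback invoked by hand rather than by the parser:

```ruby
# Hypothetical handler that gathers non-blank text nodes.
class TextCollector
  attr_reader :chunks

  def initialize
    @chunks = []
  end

  # Receives the text of each text node; whitespace-only nodes are ignored.
  def on_text(text)
    stripped = text.strip
    @chunks << stripped unless stripped.empty?
  end
end

collector = TextCollector.new
["\n  ", 'Hello', ' world ', "\n"].each { |text| collector.on_text(text) }

collector.chunks.join(' ') # => "Hello world"
```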