class Nokogiri::XML::Reader
The Reader parser lets you pull-parse an XML document efficiently. Once instantiated, call Nokogiri::XML::Reader#each to iterate over each node.
Nokogiri::XML::Reader moves through an XML document the way a cursor would: the Reader is given an XML document and yields each node, in document order, to the block passed to each.
The Reader parser may be a good fit when you need the speed and low memory usage of a SAX parser but do not want to write a SAX::Document handler.
Here is an example of usage:
  reader = Nokogiri::XML::Reader.new <<~XML
    <x xmlns:tenderlove='http://tenderlovemaking.com/'>
      <tenderlove:foo awesome='true'>snuggles!</tenderlove:foo>
    </x>
  XML

  reader.each do |node|
    # node is an instance of Nokogiri::XML::Reader
    puts node.name
  end
⚠ Nokogiri::XML::Reader#each can only be called once! Once the cursor moves through the entire document, you must parse the document again. It may be better to capture all information you need during a single iteration.
⚠ libxml2 does not support error recovery in the Reader parser. The RECOVER ParseOption is ignored. If a syntax error is encountered during parsing, an exception will be raised.
Constants
- TYPE_ATTRIBUTE
-
Attribute node type
- TYPE_CDATA
-
CDATA node type
- TYPE_COMMENT
-
Comment node type
- TYPE_DOCUMENT
-
Document node type
- TYPE_DOCUMENT_FRAGMENT
-
DocumentFragment node type
- TYPE_DOCUMENT_TYPE
-
DocumentType node type
- TYPE_ELEMENT
-
Element node type
- TYPE_END_ELEMENT
-
Element end node type
- TYPE_END_ENTITY
-
Entity end node type
- TYPE_ENTITY
-
Entity node type
- TYPE_ENTITY_REFERENCE
-
Entity Reference node type
- TYPE_NONE
- TYPE_NOTATION
-
Notation node type
- TYPE_PROCESSING_INSTRUCTION
-
PI node type
- TYPE_SIGNIFICANT_WHITESPACE
-
Significant Whitespace node type
- TYPE_TEXT
-
Text node type
- TYPE_WHITESPACE
-
Whitespace node type
- TYPE_XML_DECLARATION
-
XML Declaration node type
Attributes
errors: A list of errors encountered while parsing.
source: The XML source.
Public Class Methods
Source
  # File lib/nokogiri/xml/reader.rb, line 99
  def self.new(
    string_or_io,
    url_ = nil, encoding_ = nil, options_ = ParseOptions::STRICT,
    url: url_, encoding: encoding_, options: options_
  )
    options = Nokogiri::XML::ParseOptions.new(options) if Integer === options
    yield options if block_given?

    if string_or_io.respond_to?(:read)
      return Reader.from_io(string_or_io, url, encoding, options.to_i)
    end

    Reader.from_memory(string_or_io, url, encoding, options.to_i)
  end
Create a new Reader to parse an XML document.
- Required Parameters
-
input (String | IO): The XML document to parse.
- Optional Parameters
-
url: (String) The base URL of the document.
encoding: (String) The name of the encoding of the document.
options: (Integer | ParseOptions) Options to control the parser behavior. Defaults to ParseOptions::STRICT.
- Yields
-
If present, the block will be passed a Nokogiri::XML::ParseOptions object to modify before the document is parsed. See Nokogiri::XML::ParseOptions for more information.
Public Instance Methods
Source
  # File lib/nokogiri/xml/reader.rb, line 126
  def attributes
    attribute_hash.merge(namespaces)
  end
Get the attributes and namespaces of the current node as a Hash.
This is the union of Reader#attribute_hash and Reader#namespaces.
- Returns
-
(Hash<String, String>) Attribute names and values, and namespace prefixes and hrefs.
Source
  # File lib/nokogiri/xml/reader.rb, line 132
  def each
    while (cursor = read)
      yield cursor
    end
  end
Move the cursor through the document, yielding the cursor to the block.