I am new to using Nokogiri's Reader to parse an XML file. Here are the XML file I want to parse and my sample code:

<?xml version='1.0' encoding='UTF-8'?>
<inventory>
  <tire name="super slick racing tire" />
  <tire name="all weather tire" />
</inventory>
-----------------------------------------------------------------
require 'rubygems'
require 'nokogiri'

io = File.open('test.xml', 'r')
reader = Nokogiri::XML::Reader(io)

reader.each do |node|
  # node is an instance of Nokogiri::XML::Reader
  puts node.name
end

The following is the error message I get:

pcs$ ruby1.9 TestNok.rb
WARNING: Nokogiri was built against LibXML version 2.6.32, but has dynamically loaded 2.7.5
/usr/lib/ruby/1.9.0/nokogiri/xml/reader.rb:60:in `read': ParsePI: PI xm never end ...  (Nokogiri::XML::SyntaxError)
from /usr/lib/ruby/1.9.0/nokogiri/xml/reader.rb:60:in `each'
from TestNok.rb:7:in `<main>'
<dummy toplevel>: [BUG] Segmentation fault
ruby 1.9.0 (2008-10-04 revision 19669) [i486-linux]

-- control frame ----------
c:0001 p:0000 s:0002 b:0002 l:000001 d:000001 TOP    
---------------------------
-- backtrace of native function call (Use addr2line) --
0xb08316
0xa285e7
0xa2866a
0xab1144
0x9a0410
0xa5f315
0xa2b994
0xa2baae
0x80487e8
0x469b56
0x80486e1
-------------------------------------------------------
Aborted

Any help would be greatly appreciated.

the Tin Man
  • @Philip, before pasting code and stack traces into SO, put them in a text editor and indent everything by 4 spaces so SO recognizes it as text to be displayed verbatim. Otherwise it will interpret XML tags and other embedded characters as formatting. In your case a large part of the post was hidden because of this. – Jim Garrison Nov 12 '09 at 17:17
  • http://stackoverflow.com/q/5901400/128421 discusses this problem and how to fix it. – the Tin Man Nov 20 '11 at 08:45

2 Answers

The code that you pasted works on my machine:

jablan@jablan-hp:~/dev$ ruby testxml.rb 
inventory
#text
tire
#text
tire
#text
inventory
jablan@jablan-hp:~/dev$ ruby -v
ruby 1.9.1p243 (2009-07-16 revision 24175) [i686-linux]
jablan@jablan-hp:~/dev$ gem list | grep nokogiri
nokogiri (1.4.0)
Mladen Jablanović
It seems the reader is reporting duplicated data.

inventory
tire
tire
inventory

It should be:

inventory
tire
animuson
D.c
  • With Nokogiri (and other stream parsers) you get the tags twice, once for the opening tag, and again for the closing of the same tag. In Nokogiri::XML::Reader, you can use node_type to determine if it's the opening or closing tag (or self-closing tag). – jamuraa Feb 13 '12 at 05:38