Re: [Xen-devel] [PATCH] Replace pyxml/xmlproc-based XML validator with lxml based one.



Below is a regenerated patch that now also updates the README with
information on the new dependencies. The patch should hopefully no
longer be mangled in transit, but I have also attached it this time
in case there are problems again.

The wording in the README is of course open to discussion. I changed
"many distros" to "some" because recent releases of distributions
such as Debian and Ubuntu include minidom in their core python
packages. To be honest, I have not checked how other distros handle
this nowadays, hence the "some".

I also want to point out that I have a xen-4.0-testing hg repository
at [0] from which these changes can be pulled directly. In my own
experience they also merge into xen-unstable painlessly.

Regards,

Stephan

[0] http://bitbucket.org/sp/xen-4.0-testing-sp
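
For anyone who has not used lxml's DTD support, here is a minimal
sketch of the validation flow the patch below relies on: build an
etree.DTD, validate a parsed root element, and read the messages from
the DTD's error_log. The DTD text, the element names and the
validate() helper are made up for illustration and are not part of the
patch. The sketch is also written for Python 3, whereas the patched
file targets Python 2.

```python
import io

from lxml import etree

# Hypothetical DTD, invented for this example only.
DTD_TEXT = "<!ELEMENT vm (name)>\n<!ELEMENT name (#PCDATA)>"


def validate(xml_text, dtd_text=DTD_TEXT):
    """Return a list of error strings; the list is empty if the
    document is valid against the DTD."""
    # etree.DTD accepts a file name or a file-like object.
    dtd = etree.DTD(io.StringIO(dtd_text))
    root = etree.XML(xml_text)
    if dtd.validate(root):
        return []
    # error_log holds one entry per validation problem.
    return [str(err) for err in dtd.error_log]
```

Unlike handle_dtd_errors() in the patch, this sketch returns the
messages instead of printing them and calling sys.exit(), which leaves
the decision about bailing out to the caller.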

--
PyXML/xmlproc is used in tools/python/xen/xm/xenapi_create.py but has
been unmaintained for several years. xmlproc is used only for validating
XML documents against a DTD file.

This patch replaces the PyXML/xmlproc-based XML validation with code
based on lxml, which is actively maintained.

Signed-off-by: Stephan Peijnik <spe@xxxxxxxxx>

diff -r 6e0ffcd2d9e0 -r 7082ce86e492 tools/python/xen/xm/xenapi_create.py
--- a/tools/python/xen/xm/xenapi_create.py      Fri Sep 17 17:06:57 2010 +0100
+++ b/tools/python/xen/xm/xenapi_create.py      Fri Oct 08 12:31:18 2010 +0200
@@ -14,13 +14,15 @@
 # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
 #============================================================================
 # Copyright (C) 2007 Tom Wilkie <tom.wilkie@xxxxxxxxx>
+# Copyright (C) 2010 ANEXIA Internetdienstleistungs GmbH
+#   Author: Stephan Peijnik <spe@xxxxxxxxx>
 #============================================================================
 """Domain creation using new XenAPI
 """
 
 from xen.xm.main import server, get_default_SR
 from xml.dom.minidom import parse, getDOMImplementation
-from xml.parsers.xmlproc import xmlproc, xmlval, xmldtd
+from lxml import etree
 from xen.xend import sxp
 from xen.xend.XendAPIConstants import XEN_API_ON_NORMAL_EXIT, \
      XEN_API_ON_CRASH_BEHAVIOUR
@@ -35,6 +37,7 @@
 from os.path import join
 import traceback
 import re
+import warnings # Used by lxml-based validator
 
 def log(_, msg):
     #print "> " + msg
@@ -118,62 +121,58 @@
         Use this if possible as it gives nice
         error messages
         """
-        dtd = xmldtd.load_dtd(self.dtd)
-        parser = xmlproc.XMLProcessor()
-        parser.set_application(xmlval.ValidatingApp(dtd, parser))
-        parser.dtd = dtd
-        parser.ent = dtd
-        parser.parse_resource(file)
-
+        try:
+            dtd = etree.DTD(open(self.dtd, 'r'))
+        except IOError:
+            # The old code neither raised an exception here nor
+            # reported an error. For now we issue a warning.
+            # TODO: How to handle a missing dtd file?
+            # --sp
+            warnings.warn('DTD file %s not found.' % (self.dtd),
+                          UserWarning)
+            return
+        
+        tree = etree.parse(file)
+        root = tree.getroot()
+        if not dtd.validate(root):
+            self.handle_dtd_errors(dtd)
+            
     def check_dom_against_dtd(self, dom):
         """
         Check DOM again DTD.
         Doesn't give as nice error messages.
         (no location info)
         """
-        dtd = xmldtd.load_dtd(self.dtd)
-        app = xmlval.ValidatingApp(dtd, self)
-        app.set_locator(self)
-        self.dom2sax(dom, app)
+        try:
+            dtd = etree.DTD(open(self.dtd, 'r'))
+        except IOError:
+            # The old code neither raised an exception here nor
+            # reported an error. For now we issue a warning.
+            # TODO: How to handle a missing dtd file?
+            # --sp
+            warnings.warn('DTD file %s not found.' % (self.dtd),
+                          UserWarning)
+            return
 
-    # Get errors back from ValidatingApp       
-    def report_error(self, number, args=None):
-        self.errors = xmlproc.errors.english
-        try:
-            msg = self.errors[number]
-            if args != None:
-                msg = msg % args
-        except KeyError:
-            msg = self.errors[4002] % number # Unknown err msg :-)
-        print msg 
+        # XXX: This may be a bit slow. Maybe we should use another way
+        # of getting an etree root element from the minidom DOM tree...
+        # -- sp
+        root = etree.XML(dom.toxml())
+        if not dtd.validate(root):
+            self.handle_dtd_errors(dtd)
+
+    # Do what report_error did before. This is called directly
+    # by check_dtd and check_dom_against_dtd.
+    # We write to sys.stderr instead of using print (python3k clean).
+    def handle_dtd_errors(self, dtd):
+        # XXX: Do we really want to bail out here?
+        # -- sp
+        for err in dtd.error_log:
+            err_str = 'ERROR: %s\n' % (str(err),)
+            sys.stderr.write(err_str)
+        sys.stderr.flush()
         sys.exit(-1)
 
-    # Here for compatibility with ValidatingApp
-    def get_line(self):
-        return -1
-
-    def get_column(self):
-        return -1
-
-    def dom2sax(self, dom, app):
-        """
-        Take a dom tree and tarverse it,
-        issuing SAX calls to app.
-        """
-        for child in dom.childNodes:
-            if child.nodeType == child.TEXT_NODE:
-                data = child.nodeValue
-                app.handle_data(data, 0, len(data))
-            else:
-                app.handle_start_tag(
-                    child.nodeName,
-                    self.attrs_to_dict(child.attributes))
-                self.dom2sax(child, app)
-                app.handle_end_tag(child.nodeName)
-
-    def attrs_to_dict(self, attrs):
-        return dict(attrs.items())     
-
     #
     # Checks which cannot be done with dtd
     #

diff -r 3ce0d5dc606f -r 76fd774f7cd1 README
--- a/README    Sat Oct 09 22:19:41 2010 +0200
+++ b/README    Mon Oct 11 18:31:36 2010 +0200
@@ -137,12 +137,15 @@
 Xend (the Xen daemon) has the following runtime dependencies:
 
     * Python 2.3 or later.
-      In many distros, the XML-aspects to the standard library
+      In some distros, the XML-aspects to the standard library
       (xml.dom.minidom etc) are broken out into a separate python-xml package.
       This is also required.
+      In more recent versions of Debian and Ubuntu, however, the XML-aspects
+      are included in the base python package (python-xml was removed
+      from Debian in squeeze and from Ubuntu in intrepid).
 
           URL:    http://www.python.org/
-          Debian: python, python-xml
+          Debian: python
 
     * For optional SSL support, pyOpenSSL:
           URL:    http://pyopenssl.sourceforge.net/
@@ -153,8 +156,9 @@
           Debian: python-pam
 
-    * For optional XenAPI support in XM, PyXML:
+    * For optional XenAPI support in XM, lxml:
-          URL:    http://pyxml.sourceforge.net
-          YUM:    PyXML
+          URL:    http://codespeak.net/lxml/
+          Debian: python-lxml
+          YUM:    python-lxml
 
 
 Intel(R) Trusted Execution Technology Support

Attachment: lxml_validator.patch
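
On the open TODO in the patch about a missing DTD file: the new code
only issues a UserWarning and returns, so a caller that wants a hard
failure could escalate that warning itself. Below is a minimal sketch
using only the standard warnings module. load_dtd() and
load_dtd_strict() are hypothetical helpers standing in for the DTD
loading step; they are not code from xenapi_create.py, and the sketch
is written for Python 3.

```python
import warnings


def load_dtd(path):
    """Hypothetical stand-in for the patch's DTD loading step:
    warn and return None if the file cannot be opened."""
    try:
        with open(path, "r") as handle:
            return handle.read()
    except IOError:
        warnings.warn("DTD file %s not found." % path, UserWarning)
        return None


def load_dtd_strict(path):
    """Same as load_dtd(), but turn the UserWarning into an
    exception for the duration of the call."""
    with warnings.catch_warnings():
        warnings.simplefilter("error", UserWarning)
        return load_dtd(path)
```

With the default filters load_dtd() keeps the lenient behaviour of the
patch, while load_dtd_strict() raises UserWarning for the same missing
file. Whether xm should behave strictly here is the question the TODO
leaves open.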