
[Xen-devel] [PATCH] Replace pyxml/xmlproc-based XML validator with lxml based one.


  • To: xen-devel@xxxxxxxxxxxxxxxxxxx
  • From: Stephan Peijnik <spe@xxxxxxxxx>
  • Date: Fri, 08 Oct 2010 12:52:33 +0200
  • Delivery-date: Fri, 08 Oct 2010 09:16:45 -0700
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

pyxml/xmlproc is used in tools/python/xen/xm/xenapi_create.py but has
been unmaintained for several years. xmlproc is used there only to
validate XML documents against a DTD file.

This patch replaces the pyxml/xmlproc based XML validation with code
based on lxml, which is actively maintained.

Signed-off-by: Stephan Peijnik <spe@xxxxxxxxx>

diff -r 6e0ffcd2d9e0 -r 7082ce86e492 tools/python/xen/xm/xenapi_create.py
--- a/tools/python/xen/xm/xenapi_create.py      Fri Sep 17 17:06:57 2010 +0100
+++ b/tools/python/xen/xm/xenapi_create.py      Fri Oct 08 12:31:18 2010 +0200
@@ -14,13 +14,15 @@
 # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307 USA
 #============================================================================
 # Copyright (C) 2007 Tom Wilkie <tom.wilkie@xxxxxxxxx>
+# Copyright (C) 2010 ANEXIA Internetdienstleistungs GmbH
+#   Author: Stephan Peijnik <spe@xxxxxxxxx>
 #============================================================================
 """Domain creation using new XenAPI
 """
 
 from xen.xm.main import server, get_default_SR
 from xml.dom.minidom import parse, getDOMImplementation
-from xml.parsers.xmlproc import xmlproc, xmlval, xmldtd
+from lxml import etree
 from xen.xend import sxp
 from xen.xend.XendAPIConstants import XEN_API_ON_NORMAL_EXIT, \
      XEN_API_ON_CRASH_BEHAVIOUR
@@ -35,6 +37,7 @@
 from os.path import join
 import traceback
 import re
+import warnings # Used by lxml-based validator
 
 def log(_, msg):
     #print "> " + msg
@@ -118,62 +121,58 @@
         Use this if possible as it gives nice
         error messages
         """
-        dtd = xmldtd.load_dtd(self.dtd)
-        parser = xmlproc.XMLProcessor()
-        parser.set_application(xmlval.ValidatingApp(dtd, parser))
-        parser.dtd = dtd
-        parser.ent = dtd
-        parser.parse_resource(file)
-
+        try:
+            dtd = etree.DTD(open(self.dtd, 'r'))
+        except IOError:
+            # The old code neither raised an exception here nor
+            # reported an error. For now we issue a warning.
+            # TODO: How to handle a missing dtd file?
+            # --sp
+            warnings.warn('DTD file %s not found.' % (self.dtd),
+                          UserWarning)
+            return
+        
+        tree = etree.parse(file)
+        root = tree.getroot()
+        if not dtd.validate(root):
+            self.handle_dtd_errors(dtd)
+            
     def check_dom_against_dtd(self, dom):
         """
         Check DOM again DTD.
         Doesn't give as nice error messages.
         (no location info)
         """
-        dtd = xmldtd.load_dtd(self.dtd)
-        app = xmlval.ValidatingApp(dtd, self)
-        app.set_locator(self)
-        self.dom2sax(dom, app)
+        try:
+            dtd = etree.DTD(open(self.dtd, 'r'))
+        except IOError:
+            # The old code neither raised an exception here nor
+            # reported an error. For now we issue a warning.
+            # TODO: How to handle a missing dtd file?
+            # --sp
+            warnings.warn('DTD file %s not found.' % (self.dtd),
+                          UserWarning)
+            return
 
-    # Get errors back from ValidatingApp       
-    def report_error(self, number, args=None):
-        self.errors = xmlproc.errors.english
-        try:
-            msg = self.errors[number]
-            if args != None:
-                msg = msg % args
-        except KeyError:
-            msg = self.errors[4002] % number # Unknown err msg :-)
-        print msg 
+        # XXX: This may be a bit slow. Maybe we should use another way
+        # of getting an etree root element from the minidom DOM tree...
+        # -- sp
+        root = etree.XML(dom.toxml())
+        if not dtd.validate(root):
+            self.handle_dtd_errors(dtd)
+
+    # Do the same that was done in report_error before. This is directly
+    # called by check_dtd and check_dom_against_dtd.
+    # We are using sys.stderr instead of print though (python3k clean).
+    def handle_dtd_errors(self, dtd):
+        # XXX: Do we really want to bail out here?
+        # -- sp
+        for err in dtd.error_log:
+            err_str = 'ERROR: %s\n' % (str(err),)
+            sys.stderr.write(err_str)
+        sys.stderr.flush()
         sys.exit(-1)
 
-    # Here for compatibility with ValidatingApp
-    def get_line(self):
-        return -1
-
-    def get_column(self):
-        return -1
-
-    def dom2sax(self, dom, app):
-        """
-        Take a dom tree and tarverse it,
-        issuing SAX calls to app.
-        """
-        for child in dom.childNodes:
-            if child.nodeType == child.TEXT_NODE:
-                data = child.nodeValue
-                app.handle_data(data, 0, len(data))
-            else:
-                app.handle_start_tag(
-                    child.nodeName,
-                    self.attrs_to_dict(child.attributes))
-                self.dom2sax(child, app)
-                app.handle_end_tag(child.nodeName)
-
-    def attrs_to_dict(self, attrs):
-        return dict(attrs.items())     
-
     #
     # Checks which cannot be done with dtd
     #

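For readers unfamiliar with lxml, the calls the patch relies on
(etree.DTD, validate, error_log) can be tried in isolation. The
following is a minimal sketch and not part of the patch: the
two-element DTD, the sample documents and the helper name dtd_errors
are made up for illustration, and the import guard exists only so the
snippet loads on systems where lxml is not installed.

```python
import io

try:
    from lxml import etree  # third-party package the patch switches to
except ImportError:
    etree = None  # sketch only: lets the module load where lxml is absent

# Hypothetical DTD, not the one shipped with Xen:
# a <vm> element must contain exactly one <name> element.
SAMPLE_DTD = u"<!ELEMENT vm (name)>\n<!ELEMENT name (#PCDATA)>\n"


def dtd_errors(xml_text, dtd_text=SAMPLE_DTD):
    """Validate xml_text against dtd_text and return a list of error strings."""
    if etree is None:
        raise RuntimeError("lxml is required for DTD validation")
    dtd = etree.DTD(io.StringIO(dtd_text))  # accepts a file-like object
    root = etree.XML(xml_text)              # parse the document from a string
    if dtd.validate(root):
        return []
    # error_log holds the messages from the failed validate() call.
    return [str(err) for err in dtd.error_log]


if __name__ == "__main__" and etree is not None:
    print(dtd_errors(u"<vm><name>demo</name></vm>"))
    for line in dtd_errors(u"<vm/>"):
        print("ERROR: %s" % line)
```

Unlike handle_dtd_errors in the patch, this sketch returns the
messages instead of calling sys.exit, which leaves the decision about
bailing out to the caller.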

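The missing-DTD handling is repeated in check_dtd and
check_dom_against_dtd. One possible way to share it is sketched below
under assumptions: read_dtd_or_warn is a hypothetical helper name, it
uses only the standard library, and it keeps the patch's behaviour of
warning and skipping validation rather than answering the open TODO.

```python
import warnings


def read_dtd_or_warn(path):
    """Return the DTD file's text, or None after a UserWarning if unreadable."""
    try:
        # The with-block closes the file handle once the text is read.
        with open(path, 'r') as handle:
            return handle.read()
    except IOError:
        # Same message and warning category as in the patch.
        warnings.warn('DTD file %s not found.' % (path,), UserWarning)
        return None
```

A caller could wrap a non-None result in io.StringIO, pass it to
etree.DTD, and return early when the helper yields None.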
