a html parse problem
comp.lang.python archive


From: cheng <magicmas@spymac.com>
Date: Fri May 27 2005 - 15:42:06 CEST

Hi all,

If the HTML looks like this:
 <meta name = "description" content = "a test page">
 <meta name = "keywords" content = "keyword1 keyword2">

and I use:
    def handle_starttag(self, tag, attrs):
        # Record the attribute list only for <meta> tags; skip every other tag.
        if tag == 'meta':
            self.headers.append(str(attrs))

I get the output:
[('name', 'description'), ('content', 'a test page')]

[('name', 'keywords'), ('content', 'keyword1 keyword2')]

Is there a way to take only the content values, like "a test page,
keyword1, keyword2"?
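
One possible approach, as a minimal sketch: since attrs arrives as a list of (name, value) pairs, it can be turned into a dict and the 'content' entry looked up directly. This assumes Python 3's html.parser module (older versions import HTMLParser from the HTMLParser module instead); the class name MetaContentParser and the contents attribute are made up for illustration.

```python
from html.parser import HTMLParser


class MetaContentParser(HTMLParser):
    """Hypothetical helper: collects the content attribute of each <meta> tag."""

    def __init__(self):
        super().__init__()
        self.contents = []

    def handle_starttag(self, tag, attrs):
        if tag == 'meta':
            # attrs is a list of (name, value) pairs, so dict() allows lookup by name.
            content = dict(attrs).get('content')
            if content is not None:
                self.contents.append(content)


page = '''<meta name = "description" content = "a test page">
<meta name = "keywords" content = "keyword1 keyword2">'''

parser = MetaContentParser()
parser.feed(page)
parser.close()

result = ', '.join(parser.contents)
print(result)  # a test page, keyword1 keyword2
```

To get each keyword as a separate item, the keywords value would still need to be split on whitespace, for example with str.split().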
Received on Thu Sep 29 16:14:59 2005