popexizhi: My translation of Dive into Python XXVI------8.7 II

Example 8.16. Quoting attribute values P113

[Original] P113
>>> htmlSource = """--------------------------------------------------1
...
...
...     Test page
...

...
...

...

Home

...

Table of contents

...

Revision history

...
...
...     """
>>> from BaseHTMLProcessor import BaseHTMLProcessor
>>> parser = BaseHTMLProcessor()
>>> parser.feed(htmlSource)---------------------------------------------2
>>> print parser.output()-----------------------------------------------3

Test page

Home

Table of contents

Revision history

[Original] P114
1
Note that the attribute values of the href attributes in the tags are not properly quoted. (Also note that you're using triple quotes for something other than a doc string. And directly in the IDE, no less. They're very useful.)
2
Feed the parser.
3
Using the output function defined in BaseHTMLProcessor, you get the output as a single string, complete with quoted attribute values. While this may seem anti−climactic, think about how much has actually happened here: SGMLParser parsed the entire HTML document, breaking it down into tags, refs, data, and so forth; BaseHTMLProcessor used those elements to reconstruct pieces of HTML (which are still stored in parser.pieces, if you want to see them); finally, you called parser.output, which joined all the pieces of HTML into one string.
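
A rough Python 3 sketch of the flow that callout 3 describes, not the book's own code. The book's BaseHTMLProcessor builds on sgmllib's SGMLParser, which does not exist in Python 3, so this sketch assumes the standard-library html.parser.HTMLParser as a stand-in. The class name QuotingProcessor and the sample links are hypothetical, chosen only for illustration.

```python
from html.parser import HTMLParser


class QuotingProcessor(HTMLParser):
    """Hypothetical stand-in for BaseHTMLProcessor (assumption: html.parser, not sgmllib)."""

    def __init__(self):
        # convert_charrefs=False routes entity and character references
        # to their own handlers instead of decoding them into the data.
        super().__init__(convert_charrefs=False)
        self.pieces = []

    def handle_starttag(self, tag, attrs):
        # Rebuild the start tag, wrapping every attribute value in double quotes.
        parts = []
        for key, value in attrs:
            if value is None:
                parts.append(" %s" % key)  # attribute without a value
            else:
                parts.append(' %s="%s"' % (key, value))
        self.pieces.append("<%s%s>" % (tag, "".join(parts)))

    def handle_endtag(self, tag):
        self.pieces.append("</%s>" % tag)

    def handle_data(self, data):
        self.pieces.append(data)

    def handle_entityref(self, name):
        self.pieces.append("&%s;" % name)

    def handle_charref(self, name):
        self.pieces.append("&#%s;" % name)

    def output(self):
        # Join all collected pieces into a single string.
        return "".join(self.pieces)


# Triple quotes used for something other than a doc string, as in callout 1.
html_source = """<ul>
<li><a href=index.html>Home</a></li>
<li><a href=toc.html>Table of contents</a></li>
</ul>"""

parser = QuotingProcessor()
parser.feed(html_source)
parser.close()
result = parser.output()
print(result)
```

The unquoted href values come back wrapped in double quotes. Unlike a faithful round trip, html.parser lowercases tag and attribute names and decodes references inside attribute values, and this sketch does not escape double quotes that appear inside a value.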
----------------------------------------------------------------------

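[pope's improved translation]
1
Note that the href attribute values in the tags are not wrapped in double quotes. (Also note that, compared with a doc string, this string makes further use of triple double quotes. Using them directly {directly} in the IDE is indispensable {no less}. They are very useful.)
2.
Feed in the string to be parsed.
3.
Using the output function defined in BaseHTMLProcessor, you can output the original HTML string with its attribute values enclosed in quotes. This may not look very exciting {anti-climactic}, but think about how much has happened here: SGMLParser parses the syntax of the HTML document and divides it into tags, refs, data, and so on; BaseHTMLProcessor uses those elements to reassemble {reconstruct pieces of} the HTML (these are still stored in parser.pieces, if you are interested in looking); finally, you use parser.output to assemble all the HTML pieces into one string.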

--------------------------------------------------------------------

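[pope's translation]
1
Note that the href attribute values in the tags are not wrapped in double quotes. (Also note that, compared with a doc string, this string makes further use of triple double quotes. Using them directly {directly} in the IDE is indispensable {no less}. They are very useful.)
2.
Feed in the string to be parsed.
3.
Using the output function defined in BaseHTMLProcessor, you can output the original HTML string with its attribute values enclosed in quotes. This may not look very exciting {anti-climactic}; think about how much has happened here: SGMLParser parses the syntax of the HTML document and divides it into tags, refs, data, and so on; BaseHTMLProcessor uses those elements to reassemble {reconstruct pieces of} the HTML (these are still stored in parser.pieces, if you are interested in looking); finally, you use parser.output to assemble all the HTML pieces into one string.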

【doing】 try print parser.pieces
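[net translation, source: http://woodpecker.org.cn/diveintopython/html_processing/quoting_attribute_values.html]
1. Note that the href attribute values in the tags are not properly quoted. (Also note that, besides doc strings, we are using triple quotes somewhere other than a doc string, and no less than the direct use in the IDE. They are very useful.)
2. Feed the parser.
3. Using the output function defined in BaseHTMLProcessor, we get the output as a single string, with the attribute values fully quoted. Let us think about how much has actually happened here: SGMLParser parses the whole HTML document, breaking it down into pieces of tags, references, data, and so on. BaseHTMLProcessor uses these elements to reconstruct pieces of HTML (they are still stored in parser.pieces if you want to look at them). Finally, we call parser.output, which joins all the HTML pieces into one string.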

[popexizhi]
And directly in the IDE, no less
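[pope's translation] Using them directly in the IDE is indispensable
[net translation] and no less than the direct use in the IDE
[popexizhi: I can't say whether rendering "no less" this way makes my own translation ambiguous, but the net translation here is at least not to my taste :)]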

While this may seem anti−climactic, think about how much has actually happened here
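[pope's translation] This may not look very exciting {anti-climactic}; think about how much has happened here:
[net translation] Let us think about how much has actually happened here
[popexizhi: The net translation drops the "While". Improving the pope version to "This may not look very exciting {anti-climactic}, but think about how much has happened here" should better express what, on closer thought, is really going on behind this anti-climactic result: feel how the program makes sense of this piece of xml :)]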

【doing try print parser.pieces
>>> print parser.pieces
['\n', '', '\n', '', '\n', '', 'Test page', '', '\n', '', '\n', '', '\n', '
- ', '', 'Home', '', '
- ', '', 'Table of contents', '', '
- ', '', 'Revision history', '', '
', '\n', '', '\n\n']
[popexizhi: parser.pieces is a list ([ ]), ok]
】
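
A small sketch of what that observation implies, under the assumption that output() does nothing more than join the list: because the pieces are plain strings held in a list, one "".join call yields the final document. The pieces below are hypothetical, written out by hand with their tags, and are not the parser's actual output.

```python
# Hypothetical pieces: tags and text stored as separate strings in one list.
pieces = ['<title>', 'Test page', '</title>', '\n',
          '<a href="index.html">', 'Home', '</a>']

# Joining once at the end builds the whole string in a single pass.
joined = "".join(pieces)
print(joined)
```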

popexizhi

html tool

Wednesday, December 12, 2012
