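Filters
New in version 0.7.
Transforming a stream of tokens into another stream is called "filtering" and is done by filters. The most common filters transform each token by applying a simple rule, such as highlighting the token if it is a TODO or another special word, or converting keywords to uppercase to enforce a style guide. More complex filters can transform the stream of tokens, such as removing the line indentation or merging tokens together. Note that Pygments filters are entirely unrelated to Python's built-in filter.
Any number of filters can be applied to the token stream coming from a lexer to improve or annotate the output. To apply a filter, use the add_filter() method of a lexer: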
>>> from pygments.lexers import PythonLexer
>>> l = PythonLexer()
>>> # add a filter given by a string and options
>>> l.add_filter('codetagify', case='lower')
>>> l.filters
[<pygments.filters.CodeTagFilter object at 0xb785decc>]
>>> from pygments.filters import KeywordCaseFilter
>>> # or give an instance
>>> l.add_filter(KeywordCaseFilter(case='lower'))
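The add_filter() method takes keyword arguments, which are forwarded to the constructor of the filter.
To get a list of all registered filters by name, use the get_all_filters() function from the pygments.filters module. It returns an iterable over the names of all known filters.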
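As a minimal sketch of the lookup described above, the snippet below lists the registered filter names and then builds one filter from its name. It assumes that get_filter_by_name(), from the same pygments.filters module, is used for the lookup; the variable names are illustrative only.

```python
from pygments.filters import (
    KeywordCaseFilter,
    get_all_filters,
    get_filter_by_name,
)

# get_all_filters() yields filter names; sort them for stable display.
names = sorted(get_all_filters())
print(names)

# Build a filter from its registered name. Keyword arguments are
# passed on to the filter's constructor, as with add_filter().
flt = get_filter_by_name('keywordcase', case='upper')
print(type(flt).__name__)
```

The printed list also includes any filters registered by installed plugins, so its exact contents depend on the environment.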
If you want to write your own filter, have a look at Write your own filter.
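To give a rough idea of what a filter looks like, here is a minimal sketch that rewrites every Name token to uppercase. It assumes the Filter base class from pygments.filter and its filter(lexer, stream) generator method; the class name UppercaseNamesFilter is hypothetical and not part of Pygments.

```python
from pygments.filter import Filter
from pygments.lexers import PythonLexer
from pygments.token import Name


class UppercaseNamesFilter(Filter):
    """Hypothetical example filter: uppercase every Name token."""

    def filter(self, lexer, stream):
        # A filter receives (tokentype, value) pairs and yields new pairs.
        for ttype, value in stream:
            if ttype in Name:
                yield ttype, value.upper()
            else:
                yield ttype, value


lexer = PythonLexer()
lexer.add_filter(UppercaseNamesFilter())
text = ''.join(value for _, value in lexer.get_tokens("total = count + 1\n"))
print(text)
```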
Builtin Filters
- class CodeTagFilter
- Name: codetagify
Highlight special code tags in comments and docstrings.
Options accepted:
- codetags : list of strings
A list of strings that are flagged as code tags. The default is to highlight XXX, TODO, BUG and NOTE.
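A minimal sketch of the codetags option, assuming the Python lexer is used and that flagged tags are re-emitted with the Comment.Special token type. The tag HACK is a made-up addition for illustration.

```python
from pygments.lexers import PythonLexer
from pygments.token import Comment

lexer = PythonLexer()
# Replace the default tag list with a custom one.
lexer.add_filter('codetagify', codetags=['TODO', 'HACK'])

tokens = list(lexer.get_tokens("x = 1  # TODO tidy up, HACK for now\n"))
# Collect the comment fragments that were flagged as code tags.
special = [value for ttype, value in tokens if ttype == Comment.Special]
print(special)
```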
- class KeywordCaseFilter
- Name: keywordcase
Convert keywords to lowercase or uppercase or capitalize them, which means first letter uppercase, rest lowercase.
This can be useful e.g. if you highlight Pascal code and want to adapt the code to your styleguide.
Options accepted:
- case : string
The casing to convert keywords to. Must be one of 'lower', 'upper' or 'capitalize'. The default is 'lower'.
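The following sketch shows the case option in use. It assumes the Python lexer, where def and return are lexed as keywords; only the keyword tokens change, everything else passes through untouched.

```python
from pygments.filters import KeywordCaseFilter
from pygments.lexers import PythonLexer

source = "def f():\n    return 1\n"

lexer = PythonLexer()
lexer.add_filter(KeywordCaseFilter(case='upper'))

# Join the token values back into text to see the effect of the filter.
text = ''.join(value for _, value in lexer.get_tokens(source))
print(text)
```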
- class NameHighlightFilter
- Name: highlight
Highlight a normal Name (and Name.*) token with a different token type.
Example:
filter = NameHighlightFilter(
    names=['foo', 'bar', 'baz'],
    tokentype=Name.Function,
)
This would highlight the names “foo”, “bar” and “baz” as functions. Name.Function is the default token type.
Options accepted:
- names : list of strings
A list of names that should be given the different token type. There is no default.
- tokentype : TokenType or string
A token type or a string containing a token type name that is used for highlighting the strings in names. The default is Name.Function.
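As a sketch of how the two options interact, the snippet below attaches the filter to a Python lexer and retags one name. Name.Class is chosen here only to make the change visible; it is an arbitrary pick, not a recommendation.

```python
from pygments.filters import NameHighlightFilter
from pygments.lexers import PythonLexer
from pygments.token import Name

lexer = PythonLexer()
# Only names listed in `names` are retagged; other Name tokens are kept.
lexer.add_filter(NameHighlightFilter(names=['foo'], tokentype=Name.Class))

tokens = list(lexer.get_tokens("foo = bar + 1\n"))
highlighted = [value for ttype, value in tokens if ttype == Name.Class]
print(highlighted)
```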
- class RaiseOnErrorTokenFilter
- Name: raiseonerror
Raise an exception when the lexer generates an error token.
Options accepted:
- excclass : Exception class
The exception class to raise. The default is pygments.filters.ErrorToken.
New in version 0.8.
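A minimal sketch of the excclass option. It assumes that the Python lexer has no rule for a stray $ character and therefore emits an error token for it; ValueError is an arbitrary choice of exception class.

```python
from pygments.filters import RaiseOnErrorTokenFilter
from pygments.lexers import PythonLexer

lexer = PythonLexer()
lexer.add_filter(RaiseOnErrorTokenFilter(excclass=ValueError))

try:
    # get_tokens() is lazy, so the stream must be consumed
    # before the filter has a chance to raise.
    list(lexer.get_tokens("x = $\n"))
    raised = False
except ValueError as exc:
    raised = True
    print("lexer produced an error token for:", exc)
```

This can be handy in test suites, where a lexing error should fail loudly instead of being rendered as a highlighted error.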
- class VisibleWhitespaceFilter
- Name: whitespace
Convert tabs, newlines and/or spaces to visible characters.
Options accepted:
- spaces : string or bool
If this is a one-character string, spaces will be replaced by this string. If it is another true value, spaces will be replaced by · (Unicode MIDDLE DOT). If it is a false value, spaces will not be replaced. The default is False.
- tabs : string or bool
The same as for spaces, but the default replacement character is » (Unicode RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK). The default value is False. Note: this will not work if the tabsize option for the lexer is nonzero, as tabs will already have been expanded then.
- tabsize : int
If tabs are to be replaced by this filter (see the tabs option), this is the total number of characters that a tab should be expanded to. The default is 8.
- newlines : string or bool
The same as for spaces, but the default replacement character is ¶ (Unicode PILCROW SIGN). The default value is False.
- wstokentype : bool
If true, give whitespace the special Whitespace token type. This allows styling the visible whitespace differently (e.g. greyed out), but it can disrupt background colors. The default is True.
New in version 0.8.
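The options above can be sketched as follows, assuming the plain TextLexer so that the output depends only on the filter. The tab is replaced by » padded with spaces up to tabsize characters, and each newline is kept after its visible marker.

```python
from pygments.lexers import TextLexer

lexer = TextLexer()
# True selects the default replacement character for each option.
lexer.add_filter('whitespace', spaces=True, tabs=True, tabsize=4,
                 newlines=True)

text = ''.join(value for _, value in lexer.get_tokens("a b\tc\n"))
print(text)
```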
- class GobbleFilter
- Name: gobble
Gobbles source code lines (eats initial characters).
This filter drops the first n characters off every line of code. This may be useful when the source code fed to the lexer is indented by a fixed amount of space that isn't desired in the output.
Options accepted:
- n : int
The number of characters to gobble.
New in version 1.2.
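A short sketch of the n option, assuming source code that is uniformly indented by four spaces, for example a snippet copied out of a nested block.

```python
from pygments.lexers import PythonLexer

lexer = PythonLexer()
# Drop the first four characters of every line.
lexer.add_filter('gobble', n=4)

text = ''.join(value for _, value in
               lexer.get_tokens("    a = 1\n    b = 2\n"))
print(text)
```

Note that the filter counts characters, not whitespace: lines with less indentation than n lose real code.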
- class TokenMergeFilter
- Name: tokenmerge
Merges consecutive tokens with the same token type in the output stream of a lexer.
New in version 1.2.
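To illustrate the merging, the sketch below lexes the same input with and without the filter. It assumes the Python lexer emits each parenthesis as its own punctuation token, so that adjacent ones can be merged.

```python
from pygments.lexers import PythonLexer

source = "f((1))\n"

# Token stream without any filter.
plain = list(PythonLexer().get_tokens(source))

# Token stream with consecutive same-type tokens merged.
merging_lexer = PythonLexer()
merging_lexer.add_filter('tokenmerge')
merged = list(merging_lexer.get_tokens(source))

print(len(plain), len(merged))
```

The text is unchanged; only the number of tokens shrinks, which can reduce the size of the generated markup.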
- class SymbolFilter
- Name: symbols
Convert mathematical symbols such as \<longrightarrow> in Isabelle or \longrightarrow in LaTeX into Unicode characters.
This is mostly useful for HTML or console output when you want to approximate the source rendering you’d see in an IDE.
Options accepted:
- lang : string
The symbol language. Must be one of 'isabelle' or 'latex'. The default is 'isabelle'.
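A minimal sketch of the lang option, assuming the TexLexer emits each LaTeX command such as \alpha as a single token. A symbol is only replaced when the whole token value matches an entry in the filter's symbol table.

```python
from pygments.filters import SymbolFilter
from pygments.lexers import TexLexer

lexer = TexLexer()
lexer.add_filter(SymbolFilter(lang='latex'))

# Raw string, so the backslashes reach the lexer unchanged.
source = r"$\alpha \longrightarrow \beta$" + "\n"
text = ''.join(value for _, value in lexer.get_tokens(source))
print(text)
```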