A Pygments lexer for GNU wdiff

I still don't really see the appeal, but anyway.

I thought this would make good practice, but it turned out to be no practice at all:

Added to diff.py
# -*- coding: utf-8 -*-
"""
    pygments.lexers.diff
    ~~~~~~~~~~~~~~~~~~~~

    Lexers for diff/patch formats.

    :copyright: Copyright 2006-2015 by the Pygments team, see AUTHORS.
    :license: BSD, see LICENSE for details.
"""

import re

from pygments.lexer import RegexLexer, include, bygroups
from pygments.token import Text, Comment, Operator, Keyword, Name, Generic, \
    Literal

__all__ = ['DiffLexer', 'DarcsPatchLexer', 'WDiffLexer']


# ...

class WDiffLexer(RegexLexer):
    """
    A `wdiff <https://www.gnu.org/software/wdiff/>`_ lexer.

    Note that:

    * It only handles normal output (without options like ``-l``).
    * If the target files of wdiff contain "[-", "-]", "{+" or "+}",
      especially unbalanced ones, this lexer will get confused.

    .. versionadded:: 2.1
    """

    name = 'WDiff'
    aliases = ['wdiff']
    filenames = ['*.wdiff']
    mimetypes = []

    # DOTALL lets the catch-all r'.' rules below consume newlines as well.
    flags = re.MULTILINE | re.DOTALL

    # We can only assume that a "[-" appearing after "[-" and before "-]"
    # is nested, for instance in wdiff output of wdiff outputs. We have no
    # way to distinguish markers produced by wdiff from markers that were
    # already in the original text.

    ins_op = r"\{\+"
    ins_cl = r"\+\}"
    del_op = r"\[\-"
    del_cl = r"\-\]"
    tokens = {
        'root': [
            (ins_op, Generic.Inserted, 'inserted'),
            (del_op, Generic.Deleted, 'deleted'),
            (r'.', Text),
        ],
        'inserted': [
            # Any marker nested inside an insertion stays Generic.Inserted.
            (ins_op, Generic.Inserted, '#push'),
            (del_op, Generic.Inserted, '#push'),
            (del_cl, Generic.Inserted, '#pop'),

            (ins_cl, Generic.Inserted, '#pop'),
            (r'.', Generic.Inserted),
        ],
        'deleted': [
            # Any marker nested inside a deletion stays Generic.Deleted.
            (del_op, Generic.Deleted, '#push'),
            (ins_op, Generic.Deleted, '#push'),
            (ins_cl, Generic.Deleted, '#pop'),

            (del_cl, Generic.Deleted, '#pop'),
            (r'.', Generic.Deleted),
        ],
    }
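
To make the "nested" assumption in that comment concrete, here is a rough sketch using only the standard library. It is not part of Pygments, and `classify_wdiff`, `MARKER`, `OPENERS` and `CLOSERS` are names made up for this illustration. It mimics the same state handling: the outermost opener decides the kind, every further opener goes one level deeper, and any closer comes back up one level.

```python
import re

# The same four markers the lexer uses (hypothetical helper names).
MARKER = re.compile(r"(\{\+|\+\}|\[-|-\])")
OPENERS = {"{+": "inserted", "[-": "deleted"}
CLOSERS = {"+}", "-]"}


def classify_wdiff(text):
    """Split wdiff output into (kind, chunk) runs.

    kind is 'text', 'inserted' or 'deleted'. A nested span keeps the kind
    of the outermost span, and any closer pops exactly one level.
    """
    kind, depth = "text", 0
    runs = []
    for chunk in MARKER.split(text):
        if not chunk:
            continue
        if chunk in OPENERS:
            if depth == 0:
                kind = OPENERS[chunk]   # outermost opener decides the kind
            depth += 1
            label = kind
        elif chunk in CLOSERS and depth > 0:
            label = kind                # the closer belongs to the span it ends
            depth -= 1
            if depth == 0:
                kind = "text"
        else:
            label = kind                # plain text, or a stray closer at top level
        # Merge neighbouring chunks of the same kind into one run.
        if runs and runs[-1][0] == label:
            runs[-1] = (label, runs[-1][1] + chunk)
        else:
            runs.append((label, chunk))
    return runs


sample_runs = classify_wdiff("same [-old-] {+new+} end")
nested_runs = classify_wdiff("{+a [-b-] c+}")
```

Here `nested_runs` comes out as a single 'inserted' run, which is exactly the "give up and treat it as nested" behaviour. A stray closer at top level is simply treated as plain text.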

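If you can read the English in the comments, I hope you can sense the agony behind them. That said, once I just accepted the limitation, there was not a single hard part; it was ridiculously easy. In fact the whole thing was finished in under 20 minutes, which was rather anticlimactic. If I learned anything at all, it was only that I got to confirm what the default for flags is.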
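
On the flags point: the default in RegexLexer is flags = re.MULTILINE, so "." does not match a newline unless re.DOTALL is added. The small check below uses plain `re` only (the variable names are just for illustration) and shows the difference that the catch-all r'.' rules rely on when a marked span runs over several lines.

```python
import re

sample = "{+two\nlines+}"

# Like the RegexLexer default: MULTILINE only, so "." stops at a newline.
default_like = re.compile(r".", re.MULTILINE)
# Like WDiffLexer: with DOTALL added, "." matches the newline too.
wdiff_like = re.compile(r".", re.MULTILINE | re.DOTALL)

newline_pos = sample.index("\n")
default_match = default_like.match(sample, newline_pos)  # no match here
dotall_match = wdiff_like.match(sample, newline_pos)     # matches the newline
```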

Here is what the result looks like:

Well, seeing it done and working like this, I can certainly understand why people want it. It still doesn't do much for me, though.