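I'd like to try a little experiment in this post.
What follows is a live play-by-play of what goes on in my head.
…"It would be really nice if hl_lines accepted ranges."
…"Something like hl_lines="1-3, 5", say."
…"Or hl_lines="1-3, range(1, 10, 2)"."
…"I've been turning the input into a list on the PHP side, but it's more flexible to leave it as a string in PHP and do the work in Python."
…"So I need a _parse_hl_lines."
…"Start with an empty _parse_hl_lines…" (the file is named zzz.py)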
# -*- coding: utf-8 -*-
def _parse_hl_lines(fromform):
    pass
…"Feed it the inputs I want it to handle…"
# -*- coding: utf-8 -*-
def _parse_hl_lines(fromform):
    pass

_parse_hl_lines("")
_parse_hl_lines("32 36-38")
_parse_hl_lines("32 36 - 38")
_parse_hl_lines("range(5) range(10, 15) range(20, 30, 2) 32 36-38")
_parse_hl_lines("range(5), range(10, 15), range(20, 30, 2), 32, 36-38")
…"Allowing both whitespace and commas as separators is coming back to bite me…"
# -*- coding: utf-8 -*-
import re


def _parse_hl_lines(fromform):
    print(re.sub(r'\s+', ' ', re.sub(r"\s+-\s+", "-", fromform)))

#_parse_hl_lines("")
_parse_hl_lines("32 36-38")
_parse_hl_lines("32 36 - 38")
_parse_hl_lines("range(5) range(10, 15) range(20, 30, 2) 32 36-38")
_parse_hl_lines("range(5), range(10, 15), range(20, 30, 2), 32, 36-38")

me@host: ~$ python zzz.py
32 36-38
32 36-38
range(5) range(10, 15) range(20, 30, 2) 32 36-38
range(5), range(10, 15), range(20, 30, 2), 32, 36-38
…"Good, the whitespace on both sides of the hyphen is gone…"
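A side note on that normalization: the pattern `\s+-\s+` only fires when there is whitespace on both sides of the hyphen, so an input such as "36 -38" passes through unchanged. Here is a minimal sketch, assuming one-sided gaps should collapse as well. `normalize_hyphens` is a hypothetical helper name and is not part of zzz.py.

```python
import re


def normalize_hyphens(text):
    """Collapse optional whitespace around every hyphen (hypothetical helper)."""
    # \s* (zero or more) instead of \s+ (one or more), so "36 -38" and
    # "36- 38" are normalized as well as "36 - 38".
    return re.sub(r"\s*-\s*", "-", text)


for sample in ("32 36 - 38", "32 36 -38", "32 36- 38"):
    print(normalize_hyphens(sample))
```

Whether one-sided gaps should be accepted at all is a design choice; the stricter pattern in the script simply leaves them alone.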
…"I just need to handle the range-style entries and everything else separately. First, pull out the ranges…"
# -*- coding: utf-8 -*-
import re


def _parse_hl_lines(fromform):
    not_ranges = re.sub(r'\s+', ' ', re.sub(r"\s+-\s+", "-", fromform))
    ranges = re.findall(r"\b(range\([^()]+\))", not_ranges)
    if ranges:
        print(ranges)


_parse_hl_lines("")
_parse_hl_lines("32 36-38")
_parse_hl_lines("32 36 - 38")
_parse_hl_lines("range(5) range(10, 15) range(20, 30, 2) 32 36-38")
_parse_hl_lines("range(5), range(10, 15), range(20, 30, 2), 32, 36-38")

me@host: ~$ python zzz.py
['range(5)', 'range(10, 15)', 'range(20, 30, 2)']
['range(5)', 'range(10, 15)', 'range(20, 30, 2)']
…"I want to turn the ranges I just picked up into a regex for pulling out 'everything else', so…"
>>> ranges = ['range(5)', 'range(10, 15)', 'range(20, 30, 2)']
>>> "|".join(ranges)
'range(5)|range(10, 15)|range(20, 30, 2)'
…"Ugh, it's a regex, so the parentheses have to be escaped…"
>>> import re
>>> re.sub(r"([()])", r"\\\1", "|".join(ranges))
'range\\(5\\)|range\\(10, 15\\)|range\\(20, 30, 2\\)'
>>> "(%s)" % re.sub(r"([()])", r"\\\1", "|".join(ranges))
'(range\\(5\\)|range\\(10, 15\\)|range\\(20, 30, 2\\))'
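For what it's worth, the standard library's `re.escape` does this kind of escaping without a hand-written substitution. The sketch below is an alternative to the session above, not what the script actually uses; it escapes each captured string separately so that the `|` separators stay unescaped.

```python
import re

ranges = ['range(5)', 'range(10, 15)', 'range(20, 30, 2)']

# Escape each literal on its own, then join with "|" so the alternation
# operator itself is left alone.
pattern = "(%s)" % "|".join(re.escape(r) for r in ranges)

source = "range(5), range(10, 15), range(20, 30, 2), 32, 36-38"
leftover = re.sub(pattern, "", source)
print(leftover)  # , , , 32, 36-38
```

The exact text of `pattern` differs between Python versions, because `re.escape` escapes a different set of characters in each, but the strings it matches are the same.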
…"Good, with this I can pull 'not_ranges' out of the original input string fromform…" (back to the script…)
# -*- coding: utf-8 -*-
import re


def _parse_hl_lines(fromform):
    not_ranges = re.sub(r'\s+', ' ', re.sub(r"\s+-\s+", "-", fromform))
    ranges = re.findall(r"\b(range\([^()]+\))", not_ranges)
    if ranges:
        not_ranges = re.sub(
            "(%s)" % re.sub(r"([()])", r"\\\1", "|".join(ranges)),
            "", fromform)
        print(not_ranges)


#_parse_hl_lines("")
_parse_hl_lines("32 36-38")
_parse_hl_lines("32 36 - 38")
_parse_hl_lines("range(5) range(10, 15) range(20, 30, 2) 32 36-38")
_parse_hl_lines("range(5), range(10, 15), range(20, 30, 2), 32, 36-38")

me@host: ~$ python zzz.py
   32 36-38
, , , 32, 36-38
…"not_ranges is fine like this. For ranges, I don't need the literal 'range' or the parentheses…"
# -*- coding: utf-8 -*-
import re


def _parse_hl_lines(fromform):
    not_ranges = re.sub(r'\s+', ' ', re.sub(r"\s+-\s+", "-", fromform))
    ranges = re.findall(r"\b(range\([^()]+\))", not_ranges)
    if ranges:
        not_ranges = re.sub(
            "(%s)" % re.sub(r"([()])", r"\\\1", "|".join(ranges)),
            "", fromform)
        ranges = [re.sub(r"\brange\(([^()]+)\)", r"\1", r) for r in ranges]
        print(ranges)
        print(not_ranges)


_parse_hl_lines("")
_parse_hl_lines("32 36-38")
_parse_hl_lines("32 36 - 38")
_parse_hl_lines("range(5) range(10, 15) range(20, 30, 2) 32 36-38")
_parse_hl_lines("range(5), range(10, 15), range(20, 30, 2), 32, 36-38")

me@host: ~$ python zzz.py
['5', '10, 15', '20, 30, 2']
   32 36-38
['5', '10, 15', '20, 30, 2']
, , , 32, 36-38
…"Let me just write the loop skeletons for ranges and not_ranges for now…"
# -*- coding: utf-8 -*-
import re


def _parse_hl_lines(fromform):
    not_ranges = re.sub(r'\s+', ' ', re.sub(r"\s+-\s+", "-", fromform))
    ranges = re.findall(r"\b(range\([^()]+\))", not_ranges)
    if ranges:
        not_ranges = re.sub(
            "(%s)" % re.sub(r"([()])", r"\\\1", "|".join(ranges)),
            "", fromform)
        ranges = [re.sub(r"\brange\(([^()]+)\)", r"\1", r) for r in ranges]

    result = []
    for s in re.split(r"[\s,]+", not_ranges):
        if not s.strip():
            continue
        print(s)
    for sl in ranges:
        print(sl)
    return result

_parse_hl_lines("")
_parse_hl_lines("32 36-38")
_parse_hl_lines("32 36 - 38")
_parse_hl_lines("range(5) range(10, 15) range(20, 30, 2) 32 36-38")
_parse_hl_lines("range(5), range(10, 15), range(20, 30, 2), 32, 36-38")

me@host: ~$ python zzz.py
32
36-38
32
36-38
32
36-38
5
10, 15
20, 30, 2
32
36-38
5
10, 15
20, 30, 2
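The `if not s.strip(): continue` guard in that loop is doing real work: `re.split` returns an empty string whenever the input begins (or ends) with a separator, which is exactly what is left behind once the range(...) entries have been cut out. A small sketch of that behavior; `split_tokens` is a hypothetical helper name used only for illustration.

```python
import re


def split_tokens(text):
    """Split on runs of whitespace/commas and drop empty pieces (hypothetical helper)."""
    return [tok for tok in re.split(r"[\s,]+", text) if tok]


leftover = ", , , 32, 36-38"

# The leading ", , , " is a single separator match at position 0,
# so the first element of the raw split is an empty string.
print(re.split(r"[\s,]+", leftover))  # ['', '32', '36-38']
print(split_tokens(leftover))         # ['32', '36-38']
print(re.split(r"[\s,]+", ""))        # ['']
```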
…"From here on it'll be easier to write doctests…"
# -*- coding: utf-8 -*-
import re


def _parse_hl_lines(fromform):
    """
    >>> _parse_hl_lines("")
    []
    >>> _parse_hl_lines("32 36-38")
    [32, 36, 37, 38]
    >>> _parse_hl_lines("32 36 - 38")
    [32, 36, 37, 38]
    >>> _parse_hl_lines("range(5) range(10, 15) range(20, 30, 2) 32 36-38")
    [0, 1, 2, 3, 4, 10, 11, 12, 13, 14, 20, 22, 24, 26, 28, 32, 36, 37, 38]
    >>> _parse_hl_lines("range(5), range(10, 15), range(20, 30, 2), 32, 36-38")
    [0, 1, 2, 3, 4, 10, 11, 12, 13, 14, 20, 22, 24, 26, 28, 32, 36, 37, 38]
    """
    not_ranges = re.sub(r'\s+', ' ', re.sub(r"\s+-\s+", "-", fromform))
    ranges = re.findall(r"\b(range\([^()]+\))", not_ranges)
    if ranges:
        not_ranges = re.sub(
            "(%s)" % re.sub(r"([()])", r"\\\1", "|".join(ranges)),
            "", fromform)
        ranges = [re.sub(r"\brange\(([^()]+)\)", r"\1", r) for r in ranges]

    result = []
    for s in re.split(r"[\s,]+", not_ranges):
        if not s.strip():
            continue
        print(s)
    for sl in ranges:
        print(sl)
    return result


if __name__ == '__main__':
    import doctest
    doctest.testmod()

me@host: ~$ python zzz.py
**********************************************************************
File "zzz.py", line 9, in __main__._parse_hl_lines
Failed example:
    _parse_hl_lines("32 36-38")
Expected:
    [32, 36, 37, 38]
Got:
    32
    36-38
    []
**********************************************************************
File "zzz.py", line 11, in __main__._parse_hl_lines
Failed example:
    _parse_hl_lines("32 36 - 38")
Expected:
    [32, 36, 37, 38]
Got:
    32
    36-38
    []
**********************************************************************
File "zzz.py", line 13, in __main__._parse_hl_lines
Failed example:
    _parse_hl_lines("range(5) range(10, 15) range(20, 30, 2) 32 36-38")
Expected:
    [0, 1, 2, 3, 4, 10, 11, 12, 13, 14, 20, 22, 24, 26, 28, 32, 36, 37, 38]
Got:
    32
    36-38
    5
    10, 15
    20, 30, 2
    []
**********************************************************************
File "zzz.py", line 15, in __main__._parse_hl_lines
Failed example:
    _parse_hl_lines("range(5), range(10, 15), range(20, 30, 2), 32, 36-38")
Expected:
    [0, 1, 2, 3, 4, 10, 11, 12, 13, 14, 20, 22, 24, 26, 28, 32, 36, 37, 38]
Got:
    32
    36-38
    5
    10, 15
    20, 30, 2
    []
**********************************************************************
1 items had failures:
   4 of   5 in __main__._parse_hl_lines
***Test Failed*** 4 failures.
…"Ugh, too noisy, I don't need those prints…"
# -*- coding: utf-8 -*-
import re


def _parse_hl_lines(fromform):
    """
    >>> _parse_hl_lines("")
    []
    >>> _parse_hl_lines("32 36-38")
    [32, 36, 37, 38]
    >>> _parse_hl_lines("32 36 - 38")
    [32, 36, 37, 38]
    >>> _parse_hl_lines("range(5) range(10, 15) range(20, 30, 2) 32 36-38")
    [0, 1, 2, 3, 4, 10, 11, 12, 13, 14, 20, 22, 24, 26, 28, 32, 36, 37, 38]
    >>> _parse_hl_lines("range(5), range(10, 15), range(20, 30, 2), 32, 36-38")
    [0, 1, 2, 3, 4, 10, 11, 12, 13, 14, 20, 22, 24, 26, 28, 32, 36, 37, 38]
    """
    not_ranges = re.sub(r'\s+', ' ', re.sub(r"\s+-\s+", "-", fromform))
    ranges = re.findall(r"\b(range\([^()]+\))", not_ranges)
    if ranges:
        not_ranges = re.sub(
            "(%s)" % re.sub(r"([()])", r"\\\1", "|".join(ranges)),
            "", fromform)
        ranges = [re.sub(r"\brange\(([^()]+)\)", r"\1", r) for r in ranges]

    result = []
    for s in re.split(r"[\s,]+", not_ranges):
        if not s.strip():
            continue
    for sl in ranges:
        pass
    return result


if __name__ == '__main__':
    import doctest
    doctest.testmod()

me@host: ~$ python zzz.py
**********************************************************************
File "zzz.py", line 9, in __main__._parse_hl_lines
Failed example:
    _parse_hl_lines("32 36-38")
Expected:
    [32, 36, 37, 38]
Got:
    []
**********************************************************************
File "zzz.py", line 11, in __main__._parse_hl_lines
Failed example:
    _parse_hl_lines("32 36 - 38")
Expected:
    [32, 36, 37, 38]
Got:
    []
**********************************************************************
File "zzz.py", line 13, in __main__._parse_hl_lines
Failed example:
    _parse_hl_lines("range(5) range(10, 15) range(20, 30, 2) 32 36-38")
Expected:
    [0, 1, 2, 3, 4, 10, 11, 12, 13, 14, 20, 22, 24, 26, 28, 32, 36, 37, 38]
Got:
    []
**********************************************************************
File "zzz.py", line 15, in __main__._parse_hl_lines
Failed example:
    _parse_hl_lines("range(5), range(10, 15), range(20, 30, 2), 32, 36-38")
Expected:
    [0, 1, 2, 3, 4, 10, 11, 12, 13, 14, 20, 22, 24, 26, 28, 32, 36, 37, 38]
Got:
    []
**********************************************************************
1 items had failures:
   4 of   5 in __main__._parse_hl_lines
***Test Failed*** 4 failures.
…"The not_ranges entries become ints, and the ones containing a hyphen can be done with range…"
# -*- coding: utf-8 -*-
import re


def _parse_hl_lines(fromform):
    """
    >>> _parse_hl_lines("")
    []
    >>> _parse_hl_lines("32 36-38")
    [32, 36, 37, 38]
    >>> _parse_hl_lines("32 36 - 38")
    [32, 36, 37, 38]
    >>> _parse_hl_lines("range(5) range(10, 15) range(20, 30, 2) 32 36-38")
    [0, 1, 2, 3, 4, 10, 11, 12, 13, 14, 20, 22, 24, 26, 28, 32, 36, 37, 38]
    >>> _parse_hl_lines("range(5), range(10, 15), range(20, 30, 2), 32, 36-38")
    [0, 1, 2, 3, 4, 10, 11, 12, 13, 14, 20, 22, 24, 26, 28, 32, 36, 37, 38]
    """
    not_ranges = re.sub(r'\s+', ' ', re.sub(r"\s+-\s+", "-", fromform))
    ranges = re.findall(r"\b(range\([^()]+\))", not_ranges)
    if ranges:
        not_ranges = re.sub(
            "(%s)" % re.sub(r"([()])", r"\\\1", "|".join(ranges)),
            "", fromform)
        ranges = [re.sub(r"\brange\(([^()]+)\)", r"\1", r) for r in ranges]

    result = []
    for s in re.split(r"[\s,]+", not_ranges):
        if not s.strip():
            continue
        if "-" in s:
            spl = s.split("-")
            result.extend(range(int(spl[0]), int(spl[1]) + 1))
        else:
            result.append(int(s))
    for sl in ranges:
        pass
    return result


if __name__ == '__main__':
    import doctest
    doctest.testmod()

me@host: ~$ python zzz.py
**********************************************************************
File "zzz.py", line 13, in __main__._parse_hl_lines
Failed example:
    _parse_hl_lines("range(5) range(10, 15) range(20, 30, 2) 32 36-38")
Expected:
    [0, 1, 2, 3, 4, 10, 11, 12, 13, 14, 20, 22, 24, 26, 28, 32, 36, 37, 38]
Got:
    [32, 36, 37, 38]
**********************************************************************
File "zzz.py", line 15, in __main__._parse_hl_lines
Failed example:
    _parse_hl_lines("range(5), range(10, 15), range(20, 30, 2), 32, 36-38")
Expected:
    [0, 1, 2, 3, 4, 10, 11, 12, 13, 14, 20, 22, 24, 26, 28, 32, 36, 37, 38]
Got:
    [32, 36, 37, 38]
**********************************************************************
1 items had failures:
   2 of   5 in __main__._parse_hl_lines
***Test Failed*** 2 failures.
…"The ranges entries should be passable to range pretty much as they are, right…"
>>> range(*map(int, ["5"]))
[0, 1, 2, 3, 4]
>>> range(*map(int, ["10, 15"]))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: '10, 15'
>>> range(*map(int, ["10, 15"].split(",")))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'list' object has no attribute 'split'
>>> range(*map(int, [10, 15".split(",")))
  File "<stdin>", line 1
    range(*map(int, [10, 15".split(",")))
                                   ^
SyntaxError: invalid syntax
>>> range(*map(int, "10, 15".split(",")))
[10, 11, 12, 13, 14]
>>> range(*map(int, "20, 30, 2".split(",")))
[20, 22, 24, 26, 28]
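An aside for anyone replaying that session today: the list output shows it was run on Python 2, where `range()` returns a list. On Python 3 the same call returns a lazy `range` object, so it needs a `list()` around it to print the same way. A minimal sketch; `expand_range_args` is a hypothetical name and does not appear in zzz.py.

```python
def expand_range_args(args):
    """Expand a string such as "20, 30, 2" into the numbers range() would yield.

    Hypothetical helper for illustration only.
    """
    # int() tolerates surrounding whitespace, so " 15" parses fine
    # and the pieces need no explicit strip().
    return list(range(*map(int, args.split(","))))


print(expand_range_args("5"))          # [0, 1, 2, 3, 4]
print(expand_range_args("10, 15"))     # [10, 11, 12, 13, 14]
print(expand_range_args("20, 30, 2"))  # [20, 22, 24, 26, 28]
```

The script itself only feeds the result to `result.extend(...)`, which accepts any iterable, so that part behaves the same on both versions.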
…"So it goes like this…"
# -*- coding: utf-8 -*-
import re


def _parse_hl_lines(fromform):
    """
    >>> _parse_hl_lines("")
    []
    >>> _parse_hl_lines("32 36-38")
    [32, 36, 37, 38]
    >>> _parse_hl_lines("32 36 - 38")
    [32, 36, 37, 38]
    >>> _parse_hl_lines("range(5) range(10, 15) range(20, 30, 2) 32 36-38")
    [0, 1, 2, 3, 4, 10, 11, 12, 13, 14, 20, 22, 24, 26, 28, 32, 36, 37, 38]
    >>> _parse_hl_lines("range(5), range(10, 15), range(20, 30, 2), 32, 36-38")
    [0, 1, 2, 3, 4, 10, 11, 12, 13, 14, 20, 22, 24, 26, 28, 32, 36, 37, 38]
    """
    not_ranges = re.sub(r'\s+', ' ', re.sub(r"\s+-\s+", "-", fromform))
    ranges = re.findall(r"\b(range\([^()]+\))", not_ranges)
    if ranges:
        not_ranges = re.sub(
            "(%s)" % re.sub(r"([()])", r"\\\1", "|".join(ranges)),
            "", fromform)
        ranges = [re.sub(r"\brange\(([^()]+)\)", r"\1", r) for r in ranges]

    result = []
    for s in re.split(r"[\s,]+", not_ranges):
        if not s.strip():
            continue
        if "-" in s:
            spl = s.split("-")
            result.extend(range(int(spl[0]), int(spl[1]) + 1))
        else:
            result.append(int(s))
    for sl in ranges:
        result.extend(range(*map(int, sl.split(","))))
    return result


if __name__ == '__main__':
    import doctest
    doctest.testmod()

me@host: ~$ python zzz.py
**********************************************************************
File "zzz.py", line 13, in __main__._parse_hl_lines
Failed example:
    _parse_hl_lines("range(5) range(10, 15) range(20, 30, 2) 32 36-38")
Expected:
    [0, 1, 2, 3, 4, 10, 11, 12, 13, 14, 20, 22, 24, 26, 28, 32, 36, 37, 38]
Got:
    [32, 36, 37, 38, 0, 1, 2, 3, 4, 10, 11, 12, 13, 14, 20, 22, 24, 26, 28]
**********************************************************************
File "zzz.py", line 15, in __main__._parse_hl_lines
Failed example:
    _parse_hl_lines("range(5), range(10, 15), range(20, 30, 2), 32, 36-38")
Expected:
    [0, 1, 2, 3, 4, 10, 11, 12, 13, 14, 20, 22, 24, 26, 28, 32, 36, 37, 38]
Got:
    [32, 36, 37, 38, 0, 1, 2, 3, 4, 10, 11, 12, 13, 14, 20, 22, 24, 26, 28]
**********************************************************************
1 items had failures:
   2 of   5 in __main__._parse_hl_lines
***Test Failed*** 2 failures.
…"Good, now it looks like all that's left is to sort…"
# -*- coding: utf-8 -*-
import re


def _parse_hl_lines(fromform):
    """
    >>> _parse_hl_lines("")
    []
    >>> _parse_hl_lines("32 36-38")
    [32, 36, 37, 38]
    >>> _parse_hl_lines("32 36 - 38")
    [32, 36, 37, 38]
    >>> _parse_hl_lines("range(5) range(10, 15) range(20, 30, 2) 32 36-38")
    [0, 1, 2, 3, 4, 10, 11, 12, 13, 14, 20, 22, 24, 26, 28, 32, 36, 37, 38]
    >>> _parse_hl_lines("range(5), range(10, 15), range(20, 30, 2), 32, 36-38")
    [0, 1, 2, 3, 4, 10, 11, 12, 13, 14, 20, 22, 24, 26, 28, 32, 36, 37, 38]
    """
    not_ranges = re.sub(r'\s+', ' ', re.sub(r"\s+-\s+", "-", fromform))
    ranges = re.findall(r"\b(range\([^()]+\))", not_ranges)
    if ranges:
        not_ranges = re.sub(
            "(%s)" % re.sub(r"([()])", r"\\\1", "|".join(ranges)),
            "", fromform)
        ranges = [re.sub(r"\brange\(([^()]+)\)", r"\1", r) for r in ranges]

    result = []
    for s in re.split(r"[\s,]+", not_ranges):
        if not s.strip():
            continue
        if "-" in s:
            spl = s.split("-")
            result.extend(range(int(spl[0]), int(spl[1]) + 1))
        else:
            result.append(int(s))
    for sl in ranges:
        result.extend(range(*map(int, sl.split(","))))
    result = list(set(result))
    result.sort()
    return result


if __name__ == '__main__':
    import doctest
    doctest.testmod()

me@host: ~$ python zzz.py
me@host: ~$
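One edge case the five doctests don't cover: the range(...) entries are cut out of the raw `fromform` rather than the normalized string, so an input that mixes a range(...) entry with a spaced hyphen, such as "range(5) 36 - 38", ends up splitting into a lone "-" token and fails in `int("")`. Below is a sketch of one way around it, using a single regex pass instead of the two-stage extraction. This is a different technique from the script above, and `parse_hl_lines` and `_TOKEN` are hypothetical names.

```python
import re

# One alternation, tried left to right at each position.
_TOKEN = re.compile(
    r"\brange\(([^()]+)\)"   # group 1: the arguments of a range(...) entry
    r"|(\d+)\s*-\s*(\d+)"    # groups 2 and 3: an inclusive span such as 36 - 38
    r"|(\d+)"                # group 4: a single line number
)


def parse_hl_lines(spec):
    """Return the sorted, de-duplicated line numbers described by spec."""
    lines = set()
    for m in _TOKEN.finditer(spec):
        if m.group(1) is not None:
            lines.update(range(*map(int, m.group(1).split(","))))
        elif m.group(2) is not None:
            lines.update(range(int(m.group(2)), int(m.group(3)) + 1))
        else:
            lines.add(int(m.group(4)))
    return sorted(lines)


print(parse_hl_lines("range(5) 36 - 38"))  # [0, 1, 2, 3, 4, 36, 37, 38]
```

The trade-off is that anything the pattern doesn't recognize is silently skipped rather than raising, which may or may not be what you want for form input.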
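End of the play-by-play.
Now, a question.
How many minutes did it take you to read this post?
Probably about five, I'd guess.
I wrote this post in about 30 minutes.
One more. Can you guess how long it actually took to write this _parse_hl_lines?
The answer is about 20 minutes.
So then, what was "the experiment"? That's a secret. But leaving it there would be a bit much, so…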
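As a software developer, I've met people at all sorts of skill levels. What I've felt all along is that the gap in "intuitive things" like this sense of rhythm and sense of speed is astonishingly wide. There is a hopelessly large distance between people who "can" and people who "can't", and yet it seems to me that, far too often, it is precisely this that the "can't" side fails to notice. Take the "written in about 20 minutes" above. I ran through the whole cycle of writing the script, starting an interactive session, writing the doctests, and running the doctests in a single burst, but people on the "can't" side never notice that, and it never occurs to them that I'm "doing that much".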
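In the end, I think this is because you can't "vicariously experience being someone who can". Well, of course not: they're other people.
If so, you arrive at the question, "How can I create more opportunities for that vicarious experience?" That is the reason I'm so fixated on things like "highlighting console sessions" with Pygments and "getting things across with video and images".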