annotate ppgen.py @ 4:85c65a597420

Improves: command line options and code style. Adds a command line option for dumping the dictionary into a file. Improves source code style to better conform to pep8. Calls it BETA now.
author Bernhard Reiter <bernhard@intevation.de>
date Thu, 06 Oct 2016 17:28:46 +0200
parents 757625ec8364
children f8e24b2b6b6a
rev   line source
0
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
1 #!/usr/bin/env python3
4
85c65a597420 Improves: command line options and code style.
Bernhard Reiter <bernhard@intevation.de>
parents: 3
diff changeset
2 """Create a random passphrase from a dictionary of words. BETA
0
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
3
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
4 Relies on the entropy of python's
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
5 random.SystemRandom class
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
6 which (according to the documentation) calls os.urandom()
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
7 which (according to the documentation) calls the operating system
4
85c65a597420 Improves: command line options and code style.
Bernhard Reiter <bernhard@intevation.de>
parents: 3
diff changeset
8 specific randomness source which "should be unpredictable
0
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
9 enough for cryptographic applications"
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
10
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
11 Requires:
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
12 * Python v>=3.2
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
13 * a dictionary, Ding's trans-de-en by default.
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
14 E.g. on a Debian/Ubuntu system in package "trans-de-en".
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
15 or from http://ftp.tu-chemnitz.de/pub/Local/urz/ding/de-en/
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
16
1
00ed7df30fe4 Checking for 8k entries now. Comment improvements.
Bernhard Reiter <bernhard@intevation.de>
parents: 0
diff changeset
17 Uses a hardcoded filepath and language.
0
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
18 Search for **customize** below to change it.
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
19
3
757625ec8364 Comment added hint about SLT's Go implementation.
Bernhard Reiter <bernhard@intevation.de>
parents: 2
diff changeset
20 Related: There is a Go implementation started by Sascha L. Teichmann at
757625ec8364 Comment added hint about SLT's Go implementation.
Bernhard Reiter <bernhard@intevation.de>
parents: 2
diff changeset
21 https://bitbucket.org/s_l_teichmann/ppgen
757625ec8364 Comment added hint about SLT's Go implementation.
Bernhard Reiter <bernhard@intevation.de>
parents: 2
diff changeset
22
757625ec8364 Comment added hint about SLT's Go implementation.
Bernhard Reiter <bernhard@intevation.de>
parents: 2
diff changeset
23
0
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
24 Copyright 2016 by Intevation GmbH.
1
00ed7df30fe4 Checking for 8k entries now. Comment improvements.
Bernhard Reiter <bernhard@intevation.de>
parents: 0
diff changeset
25 Author: Bernhard E. Reiter <bernhard@intevation.de>
0
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
26
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
27 This file is Free Software under the Apache 2.0 license and thus
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
28 comes without any warranty (to the extent permissible under applicable law).
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
29 """
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
30
4
85c65a597420 Improves: command line options and code style.
Bernhard Reiter <bernhard@intevation.de>
parents: 3
diff changeset
31 import argparse
0
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
32 import math
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
33 import re
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
34 import sys
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
35
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
36 from random import SystemRandom
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
37 _srandom = SystemRandom()
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
38
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
39 tainted = False # to be set if we find a hint that the passphrase may be weak
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
40
4
85c65a597420 Improves: command line options and code style.
Bernhard Reiter <bernhard@intevation.de>
parents: 3
diff changeset
41
85c65a597420 Improves: command line options and code style.
Bernhard Reiter <bernhard@intevation.de>
parents: 3
diff changeset
42 def buildDictionary(options):
0
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
43 """Build up a dictionary of unique words, calculate stats."""
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
44 global tainted
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
45 d = []
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
46
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
47 # dictionary for testing
4
85c65a597420 Improves: command line options and code style.
Bernhard Reiter <bernhard@intevation.de>
parents: 3
diff changeset
48 #d = ["abc", "aBc", "cde", "efg", "hij", "blubber",
85c65a597420 Improves: command line options and code style.
Bernhard Reiter <bernhard@intevation.de>
parents: 3
diff changeset
49 # "jikf", "zug", "lmf", "opq"]
2
a099246680ae Fix for the unique test.
Bernhard Reiter <bernhard@intevation.de>
parents: 1
diff changeset
50 # second test dictionary to show that different string functions are used.
a099246680ae Fix for the unique test.
Bernhard Reiter <bernhard@intevation.de>
parents: 1
diff changeset
51 #d = [''.join('A' * 1000) for _ in range(1000)]
0
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
52
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
53 # Using the dictionary from Ding **customize**
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
54 d = readDingDict(filename="/usr/share/trans/de-en", useLeft=True)
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
55
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
56 ## for debugging purposes, dump dictionary
4
85c65a597420 Improves: command line options and code style.
Bernhard Reiter <bernhard@intevation.de>
parents: 3
diff changeset
57 if options.ddump_filename:
85c65a597420 Improves: command line options and code style.
Bernhard Reiter <bernhard@intevation.de>
parents: 3
diff changeset
58 print("Writing out dictionary in '{}'.".format(options.ddump_filename))
85c65a597420 Improves: command line options and code style.
Bernhard Reiter <bernhard@intevation.de>
parents: 3
diff changeset
59 with open(options.ddump_filename, "w") as f:
85c65a597420 Improves: command line options and code style.
Bernhard Reiter <bernhard@intevation.de>
parents: 3
diff changeset
60 for i in d:
85c65a597420 Improves: command line options and code style.
Bernhard Reiter <bernhard@intevation.de>
parents: 3
diff changeset
61 f.write("{}\n".format(i))
0
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
62
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
63 # Print some stats on the dictionary to be used
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
64 dl = len(d)
1
00ed7df30fe4 Checking for 8k entries now. Comment improvements.
Bernhard Reiter <bernhard@intevation.de>
parents: 0
diff changeset
65 print("Found {:d} dictionary entries.".format(dl))
00ed7df30fe4 Checking for 8k entries now. Comment improvements.
Bernhard Reiter <bernhard@intevation.de>
parents: 0
diff changeset
66 if dl < 8000:
00ed7df30fe4 Checking for 8k entries now. Comment improvements.
Bernhard Reiter <bernhard@intevation.de>
parents: 0
diff changeset
67 print("!Your dictionary is below 8k entries, that is quite small!")
0
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
68 tainted = True
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
69
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
70 print("|= Number of words |= possibilities |")
4
85c65a597420 Improves: command line options and code style.
Bernhard Reiter <bernhard@intevation.de>
parents: 3
diff changeset
71 for i in range(1, 5):
0
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
72 print("| {:2d} | 2^{:4.1f} |".format(
4
85c65a597420 Improves: command line options and code style.
Bernhard Reiter <bernhard@intevation.de>
parents: 3
diff changeset
73 i, math.log(dl**i, 2)))
0
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
74 return d
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
75
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
76
4
85c65a597420 Improves: command line options and code style.
Bernhard Reiter <bernhard@intevation.de>
parents: 3
diff changeset
77 def readDingDict(filename="/usr/share/trans/de-en", useLeft=False):
0
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
78 """Read dictionary with unique words from file in Ding format.
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
79
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
80 useLeft: Boolean to control which language to use
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
81
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
82 TODO: add option to use both languages for people that speak them both?
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
83 """
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
84
4
85c65a597420 Improves: command line options and code style.
Bernhard Reiter <bernhard@intevation.de>
parents: 3
diff changeset
85 dset = set() # using the datatype 'set' to avoid duplicates
0
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
86
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
87 splitter = re.compile(r"""\ \|\ # first pattern ' | '
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
88 |;\ # second pattern '; '
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
89 |(?<=\S)/(?=\S) # 3.: '/' surrounded by chars
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
90 |\s+ # 4.: any run of whitespace
4
85c65a597420 Improves: command line options and code style.
Bernhard Reiter <bernhard@intevation.de>
parents: 3
diff changeset
91 """, re.VERBOSE)
0
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
92
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
93 print("Reading entries from {}.".format(filename), end='')
4
85c65a597420 Improves: command line options and code style.
Bernhard Reiter <bernhard@intevation.de>
parents: 3
diff changeset
94 counter = 0 # for progress or stopping early
0
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
95 with open(filename, "r") as f:
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
96 for line in f:
4
85c65a597420 Improves: command line options and code style.
Bernhard Reiter <bernhard@intevation.de>
parents: 3
diff changeset
97 if line[0] == '#':
85c65a597420 Improves: command line options and code style.
Bernhard Reiter <bernhard@intevation.de>
parents: 3
diff changeset
98 continue
0
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
99
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
100 # languages are separated by " :: "
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
101 p = line.partition(" :: ")
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
102 languageEntry = p[0] if useLeft else p[2]
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
103
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
104 for word in splitter.split(languageEntry):
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
105 word = word.strip('(",.)\'!:;').rstrip('/')
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
106 if len(word) > 2 and word[0] not in '[{/':
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
107 dset.add(word)
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
108
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
109 #TODO: check for very common words and remove them?
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
110
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
111 counter += 1
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
112 ## stop early when debugging
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
113 #if counter > 10: break
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
114 if not counter % 10000:
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
115 print('.', end='')
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
116 sys.stdout.flush()
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
117 print()
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
118
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
119 return list(dset)
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
120
4
85c65a597420 Improves: command line options and code style.
Bernhard Reiter <bernhard@intevation.de>
parents: 3
diff changeset
121
0
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
122 def main():
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
123 global tainted
4
85c65a597420 Improves: command line options and code style.
Bernhard Reiter <bernhard@intevation.de>
parents: 3
diff changeset
124
85c65a597420 Improves: command line options and code style.
Bernhard Reiter <bernhard@intevation.de>
parents: 3
diff changeset
125 parser = argparse.ArgumentParser(description=__doc__.splitlines()[0])
85c65a597420 Improves: command line options and code style.
Bernhard Reiter <bernhard@intevation.de>
parents: 3
diff changeset
126 parser.add_argument('--ddump-filename',
85c65a597420 Improves: command line options and code style.
Bernhard Reiter <bernhard@intevation.de>
parents: 3
diff changeset
127 help='filename to dump the dictionary to')
85c65a597420 Improves: command line options and code style.
Bernhard Reiter <bernhard@intevation.de>
parents: 3
diff changeset
128 options = parser.parse_args()
85c65a597420 Improves: command line options and code style.
Bernhard Reiter <bernhard@intevation.de>
parents: 3
diff changeset
129
85c65a597420 Improves: command line options and code style.
Bernhard Reiter <bernhard@intevation.de>
parents: 3
diff changeset
130 dictionary = buildDictionary(options)
0
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
131
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
132 howMany = 4
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
133
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
134 # use a dictionary with lower case words for a simple check if
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
135 # our random source is okay
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
136 print("\nGenerated passphrase with {} randomly selected words:\n".format(
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
137 howMany))
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
138 print(" ", end='')
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
139 words = {}
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
140 for x in range(howMany):
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
141 word = _srandom.choice(dictionary)
4
85c65a597420 Improves: command line options and code style.
Bernhard Reiter <bernhard@intevation.de>
parents: 3
diff changeset
142 words[word.lower()] = True
0
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
143 print(word, end='\n ')
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
144 print("\n")
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
145
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
146 if len(words) < howMany:
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
147 print("! Your random generator is weak")
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
148 print("! or you are being very lucky.")
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
149 tainted = True
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
150
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
151 if tainted:
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
152 print("!!! Don't use the resulting passphrase !!!")
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
153
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
154 if __name__ == "__main__":
7558ecd1cbf1 Initial version.
Bernhard Reiter <bernhard@intevation.de>
parents:
diff changeset
155 main()
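
The table printed by buildDictionary() and the word selection in main() can be illustrated separately from the Ding parsing. The following is a minimal sketch, not part of ppgen.py: the helper names passphrase_bits and pick_words and the toy word list are assumptions made for illustration. It relies on log2(size ** count) being equal to count * log2(size), so the exponent can be computed without building the large integer first.

```python
import math
from random import SystemRandom


def passphrase_bits(dictionary_size, word_count):
    """Return the exponent b such that there are 2^b possible passphrases.

    Assumes every word is drawn independently and uniformly.
    """
    # log2(size ** count) == count * log2(size)
    return word_count * math.log(dictionary_size, 2)


def pick_words(dictionary, word_count, rng=None):
    """Draw word_count words independently and uniformly from dictionary."""
    # SystemRandom is the same OS-backed source the script uses
    rng = rng or SystemRandom()
    return [rng.choice(dictionary) for _ in range(word_count)]


if __name__ == "__main__":
    # 8000 is the script's own threshold below which it warns
    for n in range(1, 5):
        print("| {:2d} | 2^{:4.1f} |".format(n, passphrase_bits(8000, n)))

    # hypothetical toy word list, far too small for real use
    print(pick_words(["alpha", "bravo", "charlie", "delta"], 4))
```

At the 8000-entry threshold, four words give roughly 2^51.9 possibilities, which is why the script treats smaller dictionaries as a reason to taint the result.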