annotate ppgen.py @ 8:200c2c3c5f67
Adds comfort.
* Adds options
** '--just-passphrase': to get a passphrase on one line for use in scripts.
** '--number-of-words': the number of words the passphrase shall consist of.
* Refactors to implement the options and to prepare for future developments:
** Warnings are written to stderr, and a tainted run bails out with sys.exit().
** The output_string is built before printing.
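As a rough illustration of how '--just-passphrase' might be consumed from a script, here is a minimal sketch. It assumes Python 3.5+ (for subprocess.run) and relies only on the behaviour described above: the passphrase arrives on stdout on a single line, warnings go to stderr, and a tainted run ends with a non-zero exit status. The helper name fetch_passphrase is hypothetical and not part of ppgen.

```python
import subprocess
import sys


def fetch_passphrase(command):
    """Run `command` and return its single-line output, or None on failure.

    Hypothetical helper: a non-zero exit status is treated as
    "do not use the result", mirroring the sys.exit() on a tainted run.
    """
    result = subprocess.run(command, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE, universal_newlines=True)
    if result.returncode != 0:
        # Warnings were written to stderr by the child; pass them on.
        sys.stderr.write(result.stderr)
        return None
    # strip() drops the final newline and any trailing separator.
    return result.stdout.strip()
```

A caller could then use something like fetch_passphrase([sys.executable, "ppgen.py", "-j", "-n", "5"]), where the path to ppgen.py is an assumption about the local setup, and refuse to continue when the result is None.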
author | Bernhard Reiter <bernhard@intevation.de> |
---|---|
date | Thu, 18 Jan 2018 08:44:33 +0100 |
parents | 8b2f8f439817 |
children | 35c468a37b54 |
rev | line source |
---|---|
0 | 1 #!/usr/bin/env python3 |
4
85c65a597420
Improves: command line options and code style.
Bernhard Reiter <bernhard@intevation.de>
parents:
3
diff
changeset
|
2 """Create a random passphrase from a dictionary of words. BETA |
0 | 3 |
4 Relies on the entropy of python's | |
5 random.SystemRandom class | |
6 which (according to the documentation) calls os.urandom() | |
7 which (according to the documentation) calls the operating system | |
4 | 8 specific randomness source which "should be unpredictable |
0 | 9 enough for cryptographic applications" |
10 | |
11 Requires: | |
12 * Python v>=3.2 | |
13 * a dictionary, Ding's trans-de-en by default. | |
14 E.g. on a Debian/Ubuntu system in package "trans-de-en". | |
15 or from http://ftp.tu-chemnitz.de/pub/Local/urz/ding/de-en/ | |
16 | |
1
00ed7df30fe4
Checking for 8k entries now. Comment improvements.
Bernhard Reiter <bernhard@intevation.de>
parents:
0
diff
changeset
|
17 Uses a hardcoded filepath and language. |
0 | 18 Search for **customize** below to change it. |
19 | |
3
757625ec8364
Comment added hint about SLT's Go implementation.
Bernhard Reiter <bernhard@intevation.de>
parents:
2
diff
changeset
|
20 Related: There is a Go implementation started by Sascha L. Teichmann at |
21 https://bitbucket.org/s_l_teichmann/ppgen | |
22 | |
23 | |
8 | 24 Copyright 2016, 2017, 2018 by Intevation GmbH. |
1 | 25 Author: Bernhard E. Reiter <bernhard@intevation.de> |
0 | 26 |
27 This file is Free Software under the Apache 2.0 license and thus | |
28 comes without any warranty (to the extent permissible under applicable law). | |
29 """ | |
30 | |
4 | 31 import argparse |
0 | 32 import math |
33 import re | |
34 import sys | |
35 | |
36 from random import SystemRandom | |
37 _srandom = SystemRandom() | |
38 | |
39 tainted = False # to be set if we find a hint that the passphrase may be weak | |
40 | |
4 | 41 |
42 def buildDictionary(options): | |
0 | 43 """Build up a dictionary of unique words, calculate stats.""" |
44 global tainted | |
45 d = [] | |
46 | |
47 # dictionary for testing | |
4 | 48 #d = ["abc", "aBc", "cde", "efg", "hij", "blubber", |
49 # "jikf", "zug", "lmf", "opq"] | |
2
a099246680ae
Fix for the unique test.
Bernhard Reiter <bernhard@intevation.de>
parents:
1
diff
changeset
|
50 # second test dictionary to show that different string functions are used. |
51 #d = [''.join('A' * 1000) for _ in range(1000)] | |
0 | 52 |
53 # Using the dictionary from Ding **customize** | |
8 | 54 d = readDingDict(options, filename="/usr/share/trans/de-en", useLeft=True) |
0 | 55 |
8 | 56 # for debugging purposes, dump dictionary |
4 | 57 if options.ddump_filename: |
58 print("Writing out dictionary in '{}'.".format(options.ddump_filename)) | |
59 with open(options.ddump_filename, "w") as f: | |
60 for i in d: | |
61 f.write("{}\n".format(i)) | |
0 | 62 |
63 # Print some stats on the dictionary to be used | |
64 dl = len(d) | |
8 | 65 if not options.just_passphrase: |
66 print("Found {:d} dictionary entries.".format(dl)) | |
67 print("|= Number of words |= possibilities |") | |
68 for i in range(1, 5): | |
69 print("| {:2d} | 2^{:4.1f} |".format( | |
70 i, math.log(dl**i, 2))) | |
71 | |
1 | 72 if dl < 8000: |
8 | 73 sys.stderr.write("!Your dictionary is below 8k entries, " |
74 "that is quite small!\n") | |
0 | 75 tainted = True |
76 return d | |
77 | |
78 | |
8 | 79 def readDingDict(options, filename="/usr/share/trans/de-en", useLeft=False): |
0 | 80 """Read dictionary with unique words from file in Ding format. |
81 | |
82 useLeft: Boolean to control which language to use | |
83 | |
84 TODO: add option to use both languages for people that speak them both? | |
85 """ | |
86 | |
6
81f75c9aac84
Cleanup, minor: improves Comments. Bumps copyright.
Bernhard Reiter <bernhard@intevation.de>
parents:
5
diff
changeset
|
87 dset = set() # using the datatype 'set' to avoid duplicates |
0 | 88 |
89 splitter = re.compile(r"""\ \|\ # first pattern ' | ' | |
90 |;\ # second pattern '; ' | |
6 | 91 |(?<=\S)/(?=\S) # 3.: '/' surrounded by chars |
0 | 92 |\s+ # by whitespace |
4 | 93 """, re.VERBOSE) |
0 | 94 |
8 | 95 if not options.just_passphrase: |
96 print("Reading entries from {}.".format(filename), end='') | |
4 | 97 counter = 0 # for progress or stopping early |
0 | 98 with open(filename, "r") as f: |
99 for line in f: | |
4 | 100 if line[0] == '#': |
101 continue | |
0 | 102 |
103 # languages are separated by " :: " | |
104 p = line.partition(" :: ") | |
105 languageEntry = p[0] if useLeft else p[2] | |
106 | |
107 for word in splitter.split(languageEntry): | |
7
8b2f8f439817
Improves: ding parser.
Bernhard Reiter <bernhard@intevation.de>
parents:
6
diff
changeset
|
108 word = word.strip('(",.)\'!:;<>').rstrip('/') |
0 | 109 if len(word) > 2 and not word[0] in '[{/': |
110 dset.add(word) | |
111 | |
112 #TODO: check for very common words and remove them? | |
113 | |
114 counter += 1 | |
115 ## stop early when debugging | |
116 #if counter > 10: break | |
8 | 117 if not options.just_passphrase and counter % 10000 == 0: |
0 | 118 print('.', end='') |
119 sys.stdout.flush() | |
8 | 120 if not options.just_passphrase: |
121 print() | |
0 | 122 |
123 return list(dset) | |
124 | |
4 | 125 |
0 | 126 def main(): |
127 global tainted | |
4 | 128 |
8 | 129 parser = argparse.ArgumentParser( |
130 description=__doc__.splitlines()[0], | |
131 formatter_class=argparse.ArgumentDefaultsHelpFormatter) | |
132 parser.add_argument('-n', '--number-of-words', type=int, default=4, | |
133 help='how many words to draw for the passphrase, ' | |
134 'most useful with -j') | |
135 parser.add_argument('-j', '--just-passphrase', action="store_true", | |
136 help='only output the passphrase on a single line') | |
4 | 137 parser.add_argument('--ddump-filename', |
138 help='filename to dump the dictionary to') | |
139 options = parser.parse_args() | |
140 | |
141 dictionary = buildDictionary(options) | |
0 | 142 |
8 | 143 how_many = options.number_of_words |
0 | 144 |
8 | 145 output_string = "" |
146 if not options.just_passphrase: | |
147 print("\nGenerated passphrase with {}" | |
148 " randomly selected words:\n".format(how_many)) | |
149 print(" ", end='') | |
150 separator = '\n ' | |
151 else: | |
152 separator = ' ' | |
153 | |
154 # use a dictionary `words` with lower cased words for a rudimentary check | |
0 | 155 words = {} |
8 | 156 for x in range(how_many): |
0 | 157 word = _srandom.choice(dictionary) |
4 | 158 words[word.lower()] = True |
8 | 159 output_string += word + separator |
0 | 160 |
8 | 161 print(output_string) |
162 | |
163 if len(words) < how_many: | |
164 sys.stderr.write("! You've drawn a word more than once, this means:\n" | |
165 "! Your random generation is weak" | |
166 " or you are being very lucky.\n") | |
0 | 167 tainted = True |
168 | |
169 if tainted: | |
8 | 170 sys.exit("!!! Don't use the resulting passphrase !!!") |
0 | 171 |
172 if __name__ == "__main__": | |
173 main() |
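The statistics table printed by buildDictionary() reports log2(dl**i) for i words. Since the words are drawn independently and uniformly with _srandom.choice(), this equals i * log2(dl), which avoids computing the large power. The following is a minimal sketch of that arithmetic, assuming Python 3.3+ for math.log2; the function names passphrase_bits and words_needed are hypothetical and do not appear in ppgen.py.

```python
import math


def passphrase_bits(dictionary_size, number_of_words):
    """Bits of entropy for words drawn uniformly and independently.

    Equivalent to math.log(dictionary_size ** number_of_words, 2).
    """
    return number_of_words * math.log2(dictionary_size)


def words_needed(dictionary_size, target_bits):
    """Smallest word count reaching at least `target_bits` of entropy."""
    return math.ceil(target_bits / math.log2(dictionary_size))
```

With a dictionary of exactly 8000 entries (the threshold below which the script warns), four words give roughly 2^51.9 possibilities, and words_needed(8000, 64) returns 5.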