Mercurial > dive4elements > river
annotate flys-backend/contrib/shpimporter/utils.py @ 4887:1f6e544f7a7f
Importer: Use cp1252 instead of latin-9 to guess filename encodings
author | Andre Heinecke <aheinecke@intevation.de> |
---|---|
date | Mon, 28 Jan 2013 12:45:41 +0100 |
parents | b457532dae63 |
children | c0a58558b817 |
rev | line source |
---|---|
2798
5a654f2e35bc
Added a python tool to import shapefiles into database.
Ingo Weinzierl <ingo.weinzierl@intevation.de>
parents:
diff
changeset
|
1 import os |
4874
b1d7e600b43b
(importer) Add utility function to convert paths to utf-8
Andre Heinecke <aheinecke@intevation.de>
parents:
3654
diff
changeset
|
2 import sys |
3654
59ca5dab2782
Shape importer: use python's OptionParse to read user specific configuration from command line.
Ingo Weinzierl <ingo.weinzierl@intevation.de>
parents:
2798
diff
changeset
|
3 from shpimporter import DEBUG, INFO, ERROR |
2798
5a654f2e35bc
Added a python tool to import shapefiles into database.
Ingo Weinzierl <ingo.weinzierl@intevation.de>
parents:
diff
changeset
|
4 |
5a654f2e35bc
Added a python tool to import shapefiles into database.
Ingo Weinzierl <ingo.weinzierl@intevation.de>
parents:
diff
changeset
|
5 SHP='.shp' |
5a654f2e35bc
Added a python tool to import shapefiles into database.
Ingo Weinzierl <ingo.weinzierl@intevation.de>
parents:
diff
changeset
|
6 |
5a654f2e35bc
Added a python tool to import shapefiles into database.
Ingo Weinzierl <ingo.weinzierl@intevation.de>
parents:
diff
changeset
|
7 def findShapefiles(path): |
5a654f2e35bc
Added a python tool to import shapefiles into database.
Ingo Weinzierl <ingo.weinzierl@intevation.de>
parents:
diff
changeset
|
8 shapes = [] |
5a654f2e35bc
Added a python tool to import shapefiles into database.
Ingo Weinzierl <ingo.weinzierl@intevation.de>
parents:
diff
changeset
|
9 |
5a654f2e35bc
Added a python tool to import shapefiles into database.
Ingo Weinzierl <ingo.weinzierl@intevation.de>
parents:
diff
changeset
|
10 for root, dirs, files in os.walk(path): |
5a654f2e35bc
Added a python tool to import shapefiles into database.
Ingo Weinzierl <ingo.weinzierl@intevation.de>
parents:
diff
changeset
|
11 if len(files) == 0: |
5a654f2e35bc
Added a python tool to import shapefiles into database.
Ingo Weinzierl <ingo.weinzierl@intevation.de>
parents:
diff
changeset
|
12 continue |
5a654f2e35bc
Added a python tool to import shapefiles into database.
Ingo Weinzierl <ingo.weinzierl@intevation.de>
parents:
diff
changeset
|
13 |
3654
59ca5dab2782
Shape importer: use python's OptionParse to read user specific configuration from command line.
Ingo Weinzierl <ingo.weinzierl@intevation.de>
parents:
2798
diff
changeset
|
14 DEBUG("Processing directory '%s' with %i files " % (root, len(files))) |
2798
5a654f2e35bc
Added a python tool to import shapefiles into database.
Ingo Weinzierl <ingo.weinzierl@intevation.de>
parents:
diff
changeset
|
15 |
5a654f2e35bc
Added a python tool to import shapefiles into database.
Ingo Weinzierl <ingo.weinzierl@intevation.de>
parents:
diff
changeset
|
16 for f in files: |
5a654f2e35bc
Added a python tool to import shapefiles into database.
Ingo Weinzierl <ingo.weinzierl@intevation.de>
parents:
diff
changeset
|
17 idx = f.find(SHP) |
5a654f2e35bc
Added a python tool to import shapefiles into database.
Ingo Weinzierl <ingo.weinzierl@intevation.de>
parents:
diff
changeset
|
18 if (idx+len(SHP)) == len(f): |
5a654f2e35bc
Added a python tool to import shapefiles into database.
Ingo Weinzierl <ingo.weinzierl@intevation.de>
parents:
diff
changeset
|
19 shapes.append((f.replace(SHP, ''), root + "/" + f)) |
5a654f2e35bc
Added a python tool to import shapefiles into database.
Ingo Weinzierl <ingo.weinzierl@intevation.de>
parents:
diff
changeset
|
20 |
5a654f2e35bc
Added a python tool to import shapefiles into database.
Ingo Weinzierl <ingo.weinzierl@intevation.de>
parents:
diff
changeset
|
21 return shapes |
5a654f2e35bc
Added a python tool to import shapefiles into database.
Ingo Weinzierl <ingo.weinzierl@intevation.de>
parents:
diff
changeset
|
22 |
4874
b1d7e600b43b
(importer) Add utility function to convert paths to utf-8
Andre Heinecke <aheinecke@intevation.de>
parents:
3654
diff
changeset
|
23 def getUTF8Path(path): |
b1d7e600b43b
(importer) Add utility function to convert paths to utf-8
Andre Heinecke <aheinecke@intevation.de>
parents:
3654
diff
changeset
|
24 """ |
b1d7e600b43b
(importer) Add utility function to convert paths to utf-8
Andre Heinecke <aheinecke@intevation.de>
parents:
3654
diff
changeset
|
25 Tries to convert path to utf-8 by first checking the filesystemencoding |
b1d7e600b43b
(importer) Add utility function to convert paths to utf-8
Andre Heinecke <aheinecke@intevation.de>
parents:
3654
diff
changeset
|
26 and trying the default windows encoding afterwards. |
b1d7e600b43b
(importer) Add utility function to convert paths to utf-8
Andre Heinecke <aheinecke@intevation.de>
parents:
3654
diff
changeset
|
27 Returns a valid UTF-8 encoded unicode object or throws a UnicodeDecodeError |
b1d7e600b43b
(importer) Add utility function to convert paths to utf-8
Andre Heinecke <aheinecke@intevation.de>
parents:
3654
diff
changeset
|
28 """ |
b1d7e600b43b
(importer) Add utility function to convert paths to utf-8
Andre Heinecke <aheinecke@intevation.de>
parents:
3654
diff
changeset
|
29 try: |
b1d7e600b43b
(importer) Add utility function to convert paths to utf-8
Andre Heinecke <aheinecke@intevation.de>
parents:
3654
diff
changeset
|
30 return unicode.encode(unicode(path, sys.getfilesystemencoding()), "UTF-8") |
b1d7e600b43b
(importer) Add utility function to convert paths to utf-8
Andre Heinecke <aheinecke@intevation.de>
parents:
3654
diff
changeset
|
31 except UnicodeDecodeError: |
4887
1f6e544f7a7f
Importer: Use cp1252 instead of latin-9 to guess filename encodings
Andre Heinecke <aheinecke@intevation.de>
parents:
4884
diff
changeset
|
32 # Probably European Windows names so lets try again |
1f6e544f7a7f
Importer: Use cp1252 instead of latin-9 to guess filename encodings
Andre Heinecke <aheinecke@intevation.de>
parents:
4884
diff
changeset
|
33 return unicode.encode(unicode(path, "cp1252"), "UTF-8") |