Use unidecode to improve image filenames (fix #136)

Image filenames containing non-ASCII characters were translated to a
series of underscores (____.png). To fix this, we use the unidecode library
(which we also add to Wagtail's required packages), which transliterates
each Unicode character to an ASCII equivalent.
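
A minimal sketch of the before/after behaviour described above, assuming
Python 3 and the third-party Unidecode package. The helper names ascii_only
and transliterated are hypothetical and are not part of Wagtail; the
ImportError fallback is only there so the sketch runs without Unidecode
installed.

```python
def ascii_only(filename):
    """Old behaviour: replace every non-ASCII character with an underscore."""
    return "".join(c if ord(c) < 128 else "_" for c in filename)


def transliterated(filename):
    """New behaviour: transliterate with unidecode, then apply the same filter."""
    try:
        from unidecode import unidecode  # third-party: pip install Unidecode
    except ImportError:
        # Assumption for this sketch only: fall back to the old behaviour
        # when Unidecode is not available.
        return ascii_only(filename)
    # The ASCII filter stays as a safety net after transliteration.
    return ascii_only(unidecode(filename))


print(ascii_only("φωτο.png"))      # prints "____.png"
print(transliterated("φωτο.png"))  # ASCII transliteration when Unidecode is installed
```

The underscore filter is kept after the unidecode call, as in the patch, so
the resulting filename is always pure ASCII.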

For more info on how unidecode works, please see @Evgeny's answer to this
question:

http://stackoverflow.com/questions/702337/how-to-make-django-slugify-work-properly-with-unicode-strings
pull/139/head
Serafeim Papastefanos 2014-03-10 17:17:57 +02:00
parent e50b0fc0bb
commit 24b0712fc1
2 changed files with 5 additions and 1 deletion

@@ -48,6 +48,7 @@ setup(
"Pillow>=2.3.0",
"beautifulsoup4>=4.3.2",
"lxml>=3.3.0",
+'Unidecode>=0.04.14',
"BeautifulSoup==3.2.1", # django-compressor gets confused if we have lxml but not BS3 installed
],
zip_safe=False,

@@ -14,6 +14,8 @@ from django.utils.html import escape
from django.conf import settings
from django.utils.translation import ugettext_lazy as _
+from unidecode import unidecode
from wagtail.wagtailadmin.taggable import TagSearchable
from wagtail.wagtailimages import image_ops
@@ -25,8 +27,9 @@ class AbstractImage(models.Model, TagSearchable):
folder_name = 'original_images'
filename = self.file.field.storage.get_valid_name(filename)
+# do a unidecode in the filename and then
# replace non-ascii characters in filename with _ , to sidestep issues with filesystem encoding
-filename = "".join((i if ord(i) < 128 else '_') for i in filename)
+filename = "".join((i if ord(i) < 128 else '_') for i in unidecode(filename))
while len(os.path.join(folder_name, filename)) >= 95:
prefix, dot, extension = filename.rpartition('.')