transliterate package

Subpackages

Submodules

transliterate.base module

class transliterate.base.TranslitLanguagePack

Bases: object

Base language pack.

The attributes below shall be defined in every language pack.

language_code: Language code (obligatory). Example values: ‘hy’, ‘ru’.
language_name: Language name (obligatory). Example values: ‘Armenian’, ‘Russian’.
character_ranges: Character ranges that are specific to the language. When making a pack, check this page for the ranges.
mapping: Mapping (obligatory). A tuple consisting of two strings (source and target). Example value: (u’abc’, u’աբց’).
reversed_specific_mapping: Specific mapping (one direction only) used when transliterating from the target script to the source script (reversed transliteration).
pre_processor_mapping: Pre-processor mapping (optional). A dictionary mapping for letters that can’t be represented by a single Latin letter.
reversed_specific_pre_processor_mapping: Pre-processor mapping (optional). A dictionary mapping for letters that can’t be represented by a single Latin letter (reversed transliteration).
Example:
>>> class ArmenianLanguagePack(TranslitLanguagePack):
...     language_code = "hy"
...     language_name = "Armenian"
...     character_ranges = ((0x0530, 0x058F), (0xFB10, 0xFB1F))
...     mapping = (
...         u"abgdezilxkhmjnpsvtrcq&ofABGDEZILXKHMJNPSVTRCQOF", # Source script
...         u"աբգդեզիլխկհմյնպսվտրցքևօֆԱԲԳԴԵԶԻԼԽԿՀՄՅՆՊՍՎՏՐՑՔՕՖ", # Target script
...     )
...     reversed_specific_mapping = (
...         u"ռՌ",
...         u"rR"
...     )
...     pre_processor_mapping = {
...         # lowercase
...         u"e'": u"է",
...         u"y": u"ը",
...         u"th": u"թ",
...         u"jh": u"ժ",
...         u"ts": u"ծ",
...         u"dz": u"ձ",
...         u"gh": u"ղ",
...         u"tch": u"ճ",
...         u"sh": u"շ",
...         u"vo": u"ո",
...         u"ch": u"չ",
...         u"dj": u"ջ",
...         u"ph": u"փ",
...         u"u": u"ու",
...
...         # uppercase
...         u"E'": u"Է",
...         u"Y": u"Ը",
...         u"Th": u"Թ",
...         u"Jh": u"Ժ",
...         u"Ts": u"Ծ",
...         u"Dz": u"Ձ",
...         u"Gh": u"Ղ",
...         u"Tch": u"Ճ",
...         u"Sh": u"Շ",
...         u"Vo": u"Ո",
...         u"Ch": u"Չ",
...         u"Dj": u"Ջ",
...         u"Ph": u"Փ",
...         u"U": u"Ու"
...     }
...     reversed_specific_pre_processor_mapping = {
...         u"ու": u"u",
...         u"Ու": u"U"
...     }
    Note that in Python 3 the u prefix before the strings is not needed.
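To illustrate how mapping and pre_processor_mapping fit together, here is a minimal sketch of one way a forward transliteration could combine them. It is not the library's own implementation: the function name sketch_translit and the trimmed-down mappings are assumptions made for this illustration, and only the standard library is used.

```python
# Illustrative sketch only; not the transliterate library's own code.
# A trimmed-down subset of the Armenian mappings from the language pack example.
MAPPING = (
    "abgd",  # source script
    "աբգդ",  # target script
)
PRE_PROCESSOR_MAPPING = {
    "th": "թ",
    "sh": "շ",
}


def sketch_translit(value, mapping, pre_processor_mapping=None):
    """Apply multi-character rules first, then one-to-one character rules."""
    pre = pre_processor_mapping or {}
    # Replace longer keys first so that e.g. "tch" would win over "ch".
    for key in sorted(pre, key=len, reverse=True):
        value = value.replace(key, pre[key])
    source, target = mapping
    # Build a code point -> character table for single-letter substitutions.
    table = {ord(s): t for s, t in zip(source, target)}
    return value.translate(table)


print(sketch_translit("bad", MAPPING, PRE_PROCESSOR_MAPPING))
print(sketch_translit("shad", MAPPING, PRE_PROCESSOR_MAPPING))
```

Running the multi-character rules before the single-character table keeps a digraph such as "sh" from being split into separate "s" and "h" substitutions.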
character_ranges = None
characters = None
classmethod contains(character)

Check if given character belongs to the language pack.

Return bool:
classmethod detect(num_words=None)

Detect the language.

Heavy language detection, which is activated for languages that are harder to detect (like Russian Cyrillic and Ukrainian Cyrillic).

Parameters:
  • value (unicode) – Input string.
  • num_words (int) – Number of words to base decision on.
Return bool:

True if detected and False otherwise.

detectable = False
language_code = None
language_name = None
make_strict(value, reversed=False)

Strip out unnecessary characters from the string.

Parameters:
  • value (string) –
  • reversed (bool) –
Return string:
mapping = None
pre_processor_mapping = None
pre_processor_mapping_keys = []
reversed_characters = None
reversed_pre_processor_mapping_keys = []
reversed_specific_mapping = None
reversed_specific_pre_processor_mapping = None
reversed_specific_pre_processor_mapping_keys = []
classmethod suggest(reversed=False, limit=None)

Suggest possible variants (some sort of auto-complete).

Parameters:
  • value (str) –
  • limit (int) – Limit number of suggested variants.
Return list:
translit(value, reversed=False, strict=False, fail_silently=True)

Transliterate the given value according to the rules.

Rules are set in the transliteration pack.

Parameters:
  • value (str) –
  • reversed (bool) –
  • strict (bool) –
  • fail_silently (bool) –
Return str:

transliterate.conf module

transliterate.decorators module

transliterate.decorators.transliterate_function

alias of transliterate.decorators.TransliterateFunction

transliterate.decorators.transliterate_method

alias of transliterate.decorators.TransliterateMethod

transliterate.defaults module

transliterate.discover module

transliterate.discover.autodiscover()

Auto-discover the language packs in contrib/apps.

transliterate.exceptions module

exception transliterate.exceptions.ImproperlyConfigured

Bases: exceptions.Exception

Exception raised when the developer didn’t configure the code properly.

exception transliterate.exceptions.InvalidRegistryItemType

Bases: exceptions.ValueError

Raised when an attempt is made to register an item in the registry which does not have a proper type.

exception transliterate.exceptions.LanguageCodeError

Bases: exceptions.Exception

Exception raised when the language code is empty or has an incorrect value.

exception transliterate.exceptions.LanguageDetectionError

Bases: exceptions.Exception

Exception raised when the language can’t be detected for the text given.

exception transliterate.exceptions.LanguagePackNotFound

Bases: exceptions.Exception

Exception raised when a language pack is not found for the language code given.

transliterate.helpers module

transliterate.helpers.PROJECT_DIR(base)

Project dir.

transliterate.helpers.project_dir(base)

Project dir.

transliterate.utils module

transliterate.utils.detect_language(text, num_words=None, fail_silently=True, heavy_check=False)

Detect the language from the value given, based on ranges defined in active language packs.

Parameters:
  • text (unicode) – Input string.
  • num_words (int) – Number of words to base decision on.
  • fail_silently (bool) –
  • heavy_check (bool) – If True, heavy checks are applied when simple checks don’t give any results. Heavy checks are language-specific and are not part of the common logic; they are defined in the detect method of each language pack.
Return str:

Language code.
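The range-based detection described above can be pictured with a short sketch. This is a simplified illustration under stated assumptions, not the library's own logic: the CHARACTER_RANGES registry, the helper names, and the "most matching characters wins" rule are all hypothetical. The Armenian ranges come from the language pack example; the Greek range is the Unicode "Greek and Coptic" block, included only to have a second candidate.

```python
# Illustrative sketch only; not the transliterate library's own code.
# Hypothetical registry of language code -> inclusive character ranges.
CHARACTER_RANGES = {
    "hy": ((0x0530, 0x058F), (0xFB10, 0xFB1F)),  # Armenian
    "el": ((0x0370, 0x03FF),),  # Greek and Coptic block (assumed here)
}


def in_ranges(character, ranges):
    """Return True if the character's code point falls in any range."""
    code_point = ord(character)
    return any(low <= code_point <= high for low, high in ranges)


def sketch_detect_language(text, num_words=None):
    """Return the code whose ranges match the most characters, or None."""
    words = text.split()
    if num_words is not None:
        # Base the decision on the first num_words words only.
        words = words[:num_words]
    counts = {}
    for word in words:
        for character in word:
            for code, ranges in CHARACTER_RANGES.items():
                if in_ranges(character, ranges):
                    counts[code] = counts.get(code, 0) + 1
    if not counts:
        return None
    return max(counts, key=counts.get)


print(sketch_detect_language("Հայաստան"))
print(sketch_detect_language("plain latin text"))
```

A range check like this cannot separate languages that share a script, which is the case the heavy_check option and each pack's detect method are meant to cover.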

transliterate.utils.get_available_language_codes()

Get list of language codes for registered language packs.

Return list:
transliterate.utils.get_available_language_packs()

Get list of registered language packs.

Return list:
transliterate.utils.get_translit_function(language_code)

Return translit function for the language given.

Parameters:language_code (str) –
Return callable:
 
transliterate.utils.slugify(text, language_code=None)

Slugify the given text.

If no language_code is given, auto-detect the language code from text given.

Parameters:
  • text (str) –
  • language_code (str) –
Return str:
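As a rough picture of what slugifying involves once the text is in Latin script, the sketch below covers only that final step. It assumes the input has already been transliterated; the function name sketch_slugify and the exact cleaning rules are assumptions for illustration and may differ from what the library does internally.

```python
# Illustrative sketch only; not the transliterate library's own code.
import re
import unicodedata


def sketch_slugify(transliterated_text):
    """Turn already-transliterated (Latin) text into a URL-friendly slug."""
    # Decompose accented letters and drop anything outside ASCII.
    ascii_text = (
        unicodedata.normalize("NFKD", transliterated_text)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Remove characters that are not word characters, whitespace or hyphens.
    cleaned = re.sub(r"[^\w\s-]", "", ascii_text).strip().lower()
    # Collapse runs of whitespace and hyphens into single hyphens.
    return re.sub(r"[-\s]+", "-", cleaned)


print(sketch_slugify("Lorem ipsum dolor sit amet"))
```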
transliterate.utils.suggest(value, language_code=None, reversed=False, limit=None)

Suggest possible variants.

Parameters:
  • value (str) –
  • language_code (str) –
  • reversed (bool) – If set to True, reversed transliteration is performed.
  • limit (int) – Limit number of suggested variants.
Return list:
transliterate.utils.translit(value, language_code=None, reversed=False, strict=False)

Transliterate the text for the language given.

Language code is optional for reversed transliteration (from some script to Latin).

Parameters:
  • value (str) –
  • language_code (str) –
  • reversed (bool) – If set to True, reversed transliteration is performed.
  • strict (bool) – If True, all characters that are not found in the transliteration pack are stripped out.
Return str:
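The effect of the strict option can be sketched as a simple character filter. This is an illustration under assumptions, not the library's own behaviour: the function name sketch_strict is hypothetical, and keeping whitespace is a choice made for this sketch.

```python
# Illustrative sketch only; not the transliterate library's own code.
def sketch_strict(value, allowed_characters):
    """Drop every character that is neither allowed nor whitespace."""
    allowed = set(allowed_characters)
    # Keeping whitespace is an assumption of this sketch.
    return "".join(ch for ch in value if ch in allowed or ch.isspace())


print(sketch_strict("ab1 c!", "abc"))
```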

Module contents

transliterate.detect_language(text, num_words=None, fail_silently=True, heavy_check=False)

Detect the language from the value given, based on ranges defined in active language packs.

Parameters:
  • text (unicode) – Input string.
  • num_words (int) – Number of words to base decision on.
  • fail_silently (bool) –
  • heavy_check (bool) – If True, heavy checks are applied when simple checks don’t give any results. Heavy checks are language-specific and are not part of the common logic; they are defined in the detect method of each language pack.
Return str:

Language code.

transliterate.get_available_language_codes()

Get list of language codes for registered language packs.

Return list:
transliterate.get_available_language_packs()

Get list of registered language packs.

Return list:
transliterate.get_translit_function(language_code)

Return translit function for the language given.

Parameters:language_code (str) –
Return callable:
 
transliterate.slugify(text, language_code=None)

Slugify the given text.

If no language_code is given, auto-detect the language code from text given.

Parameters:
  • text (str) –
  • language_code (str) –
Return str:
transliterate.translit(value, language_code=None, reversed=False, strict=False)

Transliterate the text for the language given.

Language code is optional for reversed transliteration (from some script to Latin).

Parameters:
  • value (str) –
  • language_code (str) –
  • reversed (bool) – If set to True, reversed transliteration is performed.
  • strict (bool) – If True, all characters that are not found in the transliteration pack are stripped out.
Return str: