transliterate package

Subpackages

Submodules

transliterate.base module

class transliterate.base.TranslitLanguagePack

Bases: object

Base language pack. The attributes below should be defined in every language pack.

  • language_code – Language code (obligatory). Example values: "hy", "ru".
  • language_name – Language name (obligatory). Example values: "Armenian", "Russian".
  • character_ranges – Character ranges that are specific to the language. When making a pack, check this page for the ranges.
  • mapping – Mapping (obligatory). A tuple consisting of two strings (source and target). Example value: (u"abc", u"աբց").
  • reversed_specific_mapping – Specific mapping (one direction only) used when transliterating from the target script to the source script (reversed transliteration).
  • pre_processor_mapping – Pre-processor mapping (optional). A dictionary mapping for letters that can't be represented by a single Latin letter.
  • reversed_specific_pre_processor_mapping – Pre-processor mapping (optional). A dictionary mapping for letters that can't be represented by a single Latin letter (reversed transliteration).
Example:

>>> from transliterate.base import TranslitLanguagePack
>>> class ArmenianLanguagePack(TranslitLanguagePack):
...     language_code = "hy"
...     language_name = "Armenian"
...     character_ranges = ((0x0530, 0x058F), (0xFB10, 0xFB1F))
...     mapping = (
...         u"abgdezilxkhmjnpsvtrcq&ofABGDEZILXKHMJNPSVTRCQOF",  # Source script
...         u"աբգդեզիլխկհմյնպսվտրցքևօֆԱԲԳԴԵԶԻԼԽԿՀՄՅՆՊՍՎՏՐՑՔՕՖ",  # Target script
...     )
...     reversed_specific_mapping = (
...         u"ռՌ",
...         u"rR"
...     )
...     pre_processor_mapping = {
...         # lowercase
...         u"e'": u"է",
...         u"y": u"ը",
...         u"th": u"թ",
...         u"jh": u"ժ",
...         u"ts": u"ծ",
...         u"dz": u"ձ",
...         u"gh": u"ղ",
...         u"tch": u"ճ",
...         u"sh": u"շ",
...         u"vo": u"ո",
...         u"ch": u"չ",
...         u"dj": u"ջ",
...         u"ph": u"փ",
...         u"u": u"ու",
...
...         # uppercase
...         u"E'": u"Է",
...         u"Y": u"Ը",
...         u"Th": u"Թ",
...         u"Jh": u"Ժ",
...         u"Ts": u"Ծ",
...         u"Dz": u"Ձ",
...         u"Gh": u"Ղ",
...         u"Tch": u"Ճ",
...         u"Sh": u"Շ",
...         u"Vo": u"Ո",
...         u"Ch": u"Չ",
...         u"Dj": u"Ջ",
...         u"Ph": u"Փ",
...         u"U": u"Ու"
...     }
...     reversed_specific_pre_processor_mapping = {
...         u"ու": u"u",
...         u"Ու": u"U"
...     }

Note that in Python 3 the u prefix before the strings is not needed.
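To make the roles of mapping and pre_processor_mapping concrete, the following is a minimal, self-contained sketch of how such attributes could be applied to a string. The helper apply_pack is hypothetical and not part of transliterate; it assumes that multi-character pre-processor rules run before the one-to-one character mapping, which may differ from what the library actually does.

```python
def apply_pack(value, mapping, pre_processor_mapping=None):
    """Hypothetical helper (not the library's code): apply pack-style rules."""
    # Assumption: multi-character rules run first, longest key first,
    # so that "tch" is consumed before "ch" can match inside it.
    rules = pre_processor_mapping or {}
    for key in sorted(rules, key=len, reverse=True):
        value = value.replace(key, rules[key])
    # One-to-one mapping: pair up the source and target strings.
    source, target = mapping
    table = {ord(s): t for s, t in zip(source, target)}
    return value.translate(table)


# A small subset of the Armenian example above.
mapping = ("abgd", "աբգդ")
pre_processor_mapping = {"sh": "շ", "ch": "չ", "tch": "ճ"}

print(apply_pack("bad", mapping, pre_processor_mapping))
print(apply_pack("tcha", mapping, pre_processor_mapping))
```

Sorting the keys by length is a design choice of this sketch: without it, "ch" could match inside "tch" and produce a different result.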
character_ranges = None
characters = None
classmethod contains(character)

Checks if given character belongs to the language pack.

Return bool:
detect(text, num_words=None)

Heavy language detection, which is activated for languages that are harder to detect (like Russian Cyrillic and Ukrainian Cyrillic).

Parameters:
  • text (unicode) – Input string.
  • num_words (int) – Number of words to base decision on.
Return bool:

True if detected and False otherwise.

detectable = False
language_code = None
language_name = None
make_strict(value, reversed=False)

Strips out unnecessary characters from the string.

Parameters:
  • value (string) –
  • reversed (bool) –
Return string:
mapping = None
pre_processor_mapping = None
pre_processor_mapping_keys = []
reversed_characters = None
reversed_pre_processor_mapping_keys = []
reversed_specific_mapping = None
reversed_specific_pre_processor_mapping = None
reversed_specific_pre_processor_mapping_keys = []
suggest(value, reversed=False, limit=None)

Suggest possible variants (some sort of auto-complete).

Parameters:
  • value (str) –
  • reversed (bool) –
  • limit (int) – Limit number of suggested variants.
Return list:
translit(value, reversed=False, strict=False, fail_silently=True)

Transliterates the given value according to the rules set in the transliteration pack.

Parameters:
  • value (str) –
  • reversed (bool) –
  • strict (bool) –
  • fail_silently (bool) –
Return str:
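The reversed flag, together with reversed_specific_mapping and reversed_specific_pre_processor_mapping, can be illustrated with a short sketch. The helper reverse_apply is hypothetical and is not the library's implementation; it assumes that the reversed direction swaps the two strings of mapping and then adds the one-direction-only pairs.

```python
def reverse_apply(value, mapping, reversed_specific_mapping=None,
                  reversed_specific_pre_processor_mapping=None):
    """Hypothetical helper (not the library's code): target script -> source."""
    # Assumption: multi-character target sequences are rewritten first.
    pre_rules = reversed_specific_pre_processor_mapping or {}
    for key, replacement in pre_rules.items():
        value = value.replace(key, replacement)
    # Swap the two strings of the mapping to go in the opposite direction.
    source, target = mapping
    table = {ord(t): s for s, t in zip(source, target)}
    # One-direction-only pairs extend the swapped table.
    if reversed_specific_mapping:
        extra_from, extra_to = reversed_specific_mapping
        table.update({ord(f): t for f, t in zip(extra_from, extra_to)})
    return value.translate(table)


# A small subset of the Armenian example above.
mapping = ("abgdr", "աբգդր")

print(reverse_apply("բադ", mapping))
print(reverse_apply("ռ", mapping, reversed_specific_mapping=("ռՌ", "rR")))
```

In this sketch both "ր" and "ռ" come back as "r", which shows why such pairs can only be used in one direction.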

transliterate.conf module

transliterate.decorators module

transliterate.decorators.transliterate_function

alias of TransliterateFunction

transliterate.decorators.transliterate_method

alias of TransliterateMethod

transliterate.defaults module

transliterate.discover module

transliterate.discover.autodiscover()

Auto-discovers the language packs in contrib/apps.

transliterate.exceptions module

exception transliterate.exceptions.LanguageCodeError

Bases: exceptions.Exception

Exception raised when language code is left empty or has incorrect value.

exception transliterate.exceptions.ImproperlyConfigured

Bases: exceptions.Exception

Exception raised when developer didn’t configure the code properly.

exception transliterate.exceptions.LanguagePackNotFound

Bases: exceptions.Exception

Exception raised when language pack is not found for the language code given.

exception transliterate.exceptions.LanguageDetectionError

Bases: exceptions.Exception

Exception raised when language can’t be detected for the text given.

exception transliterate.exceptions.InvalidRegistryItemType

Bases: exceptions.ValueError

Raised when an attempt is made to register an item in the registry which does not have a proper type.
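As an illustration of when two of these exceptions would apply, here is a self-contained sketch with stand-in classes. The exception classes below only mirror the documented names and base classes; Registry and BasePack are hypothetical and do not reflect how transliterate stores its language packs.

```python
class LanguagePackNotFound(Exception):
    """Stand-in mirroring the documented exception name."""


class InvalidRegistryItemType(ValueError):
    """Stand-in mirroring the documented exception name."""


class BasePack:
    """Hypothetical base class standing in for a language pack."""
    language_code = None


class Registry:
    """Hypothetical registry; the real one may work differently."""

    def __init__(self):
        self._packs = {}

    def register(self, pack_cls):
        # Reject anything that is not a subclass of the base pack.
        if not (isinstance(pack_cls, type) and issubclass(pack_cls, BasePack)):
            raise InvalidRegistryItemType(
                "%r is not a BasePack subclass" % (pack_cls,)
            )
        self._packs[pack_cls.language_code] = pack_cls

    def get(self, language_code):
        # An unknown language code is reported with a dedicated exception.
        try:
            return self._packs[language_code]
        except KeyError:
            raise LanguagePackNotFound(
                "No language pack for %r" % (language_code,)
            ) from None


class ArmenianPack(BasePack):
    language_code = "hy"


registry = Registry()
registry.register(ArmenianPack)
print(registry.get("hy").language_code)
```

Because the InvalidRegistryItemType stand-in derives from ValueError, as documented, callers that already catch ValueError would also catch it.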

transliterate.helpers module

transliterate.helpers.PROJECT_DIR(base)

transliterate.utils module

transliterate.utils.translit(value, language_code=None, reversed=False, strict=False)

Transliterates the text for the language given. The language code is optional for reversed transliteration (from some script to Latin).

Parameters:
  • value (str) –
  • language_code (str) –
  • reversed (bool) – If set to True, reversed transliteration is made.
  • strict (bool) – If set to True, all characters that are not found in the transliteration pack are stripped out.
Return str:
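The effect of strict can be sketched in a few lines. The helper translit_sketch and its table argument are hypothetical; the sketch only illustrates the documented idea that characters missing from the pack are dropped, and the library may treat whitespace, digits, or punctuation differently.

```python
def translit_sketch(value, table, strict=False):
    """Hypothetical helper: `table` maps single characters to replacements."""
    result = []
    for character in value:
        if character in table:
            result.append(table[character])
        elif not strict:
            # Non-strict mode keeps characters the table does not know.
            result.append(character)
        # In strict mode, unknown characters are dropped.
    return "".join(result)


table = {"a": "ա", "b": "բ", "d": "դ"}

print(translit_sketch("bad!", table))
print(translit_sketch("bad!", table, strict=True))
```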
transliterate.utils.suggest(value, language_code=None, reversed=False, limit=None)

Suggest possible variants.

Parameters:
  • value (str) –
  • language_code (str) –
  • reversed (bool) – If set to True, reversed transliteration is made.
  • limit (int) – Limit number of suggested variants.
Return list:
transliterate.utils.detect_language(text, num_words=None, fail_silently=True, heavy_check=False)

Detects the language of the given text, based on the ranges defined in active language packs.

Parameters:
  • text (unicode) – Input string.
  • num_words (int) – Number of words to base decision on.
  • fail_silently (bool) –
  • heavy_check (bool) – If set to True, heavy checks are applied when simple checks don't give any results. Heavy checks are language-specific and are not part of the common logic. Heavy language detection is defined in the detect method of each language pack.
Return str:

Language code.
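Range-based detection of this kind can be sketched as follows. Everything in the block is illustrative: PACK_RANGES and detect_language_sketch are hypothetical names, the Armenian ranges are taken from the example pack above, the Greek entry uses the Unicode Greek and Coptic block, and the behaviour of fail_silently (return None instead of raising) is an assumption. The sketch raises ValueError where the library would presumably raise LanguageDetectionError.

```python
from collections import Counter

# Hypothetical registry of character ranges per language code.
PACK_RANGES = {
    "hy": ((0x0530, 0x058F), (0xFB10, 0xFB1F)),  # from the Armenian example
    "el": ((0x0370, 0x03FF),),                   # Unicode Greek and Coptic block
}


def detect_language_sketch(text, num_words=None, fail_silently=True):
    """Hypothetical helper: pick the language whose ranges match most characters."""
    words = text.split()
    if num_words is not None:
        words = words[:num_words]  # base the decision on the first words only
    hits = Counter()
    for character in "".join(words):
        code_point = ord(character)
        for language_code, ranges in PACK_RANGES.items():
            if any(low <= code_point <= high for low, high in ranges):
                hits[language_code] += 1
    if hits:
        return hits.most_common(1)[0][0]
    if fail_silently:
        return None  # assumption: silent failure means "no result"
    raise ValueError("language could not be detected")


print(detect_language_sketch("բարև"))
print(detect_language_sketch("hello"))
```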

transliterate.utils.slugify(text, language_code=None)

Slugifies the given text. If no language_code is given, the language code is auto-detected from the text.

Parameters:
  • text (unicode) –
  • language_code (str) –
Return str:
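A slug of this kind is typically produced by transliterating to Latin and then normalising the result. The sketch below shows that idea under stated assumptions: slugify_sketch and its table argument are hypothetical, and the exact normalisation rules used by transliterate are not described here and may differ.

```python
import re


def slugify_sketch(text, table):
    """Hypothetical helper: `table` maps non-Latin characters to Latin text."""
    # Lower-case first, then replace each character the table knows about.
    latin = "".join(table.get(character, character) for character in text.lower())
    # Collapse every run of characters other than a-z and 0-9 into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", latin)
    return slug.strip("-")


table = {"բ": "b", "ա": "a", "դ": "d"}

print(slugify_sketch("բադ  բադ!", table))
```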

Module contents

transliterate.translit(value, language_code=None, reversed=False, strict=False)

Transliterates the text for the language given. The language code is optional for reversed transliteration (from some script to Latin).

Parameters:
  • value (str) –
  • language_code (str) –
  • reversed (bool) – If set to True, reversed transliteration is made.
  • strict (bool) – If set to True, all characters that are not found in the transliteration pack are stripped out.
Return str:
transliterate.get_available_language_codes()

Gets list of language codes for registered language packs.

Return list:
transliterate.detect_language(text, num_words=None, fail_silently=True, heavy_check=False)

Detects the language of the given text, based on the ranges defined in active language packs.

Parameters:
  • text (unicode) – Input string.
  • num_words (int) – Number of words to base decision on.
  • fail_silently (bool) –
  • heavy_check (bool) – If set to True, heavy checks are applied when simple checks don't give any results. Heavy checks are language-specific and are not part of the common logic. Heavy language detection is defined in the detect method of each language pack.
Return str:

Language code.

transliterate.slugify(text, language_code=None)

Slugifies the given text. If no language_code is given, the language code is auto-detected from the text.

Parameters:
  • text (unicode) –
  • language_code (str) –
Return str:
transliterate.get_available_language_packs()

Gets list of registered language packs.

Return list: