You regularly need to convert some piece of text into a maximally compatible form for a URL, a file name, an object name in some piece of software, and so on. The compatibility requirement is simple: the text may contain only allowed characters. Usually these are a-z, 0-9, and "_" or "-". In other words, only lowercase Latin letters and digits, plus a separator (as an example).
Say we need to turn a blog post title into a slug to put in that post's URL. What is the best way to do it?
Django ships with a ready-made slugify function for such cases.
But I never use it. Why? Because it isn't enough!
Here is an example:
>>> from django.utils.text import slugify
>>> slugify('This is a Title')
'this-is-a-title'
So far so good.
>>> slugify('This is a "Title!"')
'this-is-a-title'
The special characters were removed, all good.
>>> slugify('Это заголовок статьи')
''
And here we hit a wall 😢. If the text is not in English, the non-ASCII letters are simply dropped. This can be fixed:
>>> slugify('Это заголовок статьи', allow_unicode=True)
'это-заголовок-статьи'
But then we no longer meet the requirement: the slug now contains Cyrillic.
Since I often build sites for Russian-speaking users, this problem comes up a lot. So I skip the standard function and always write my own.
I don't start from the original; the function is written entirely from scratch. So, step by step:
🔸1. Source text:
>>> text = 'Мой заголовок №10 😁!'
I deliberately picked a harder one, with special characters.
🔸2. Transliteration
All characters need to be transliterated into Latin script. The unidecode library is a great help here. Besides plain Cyrillic-to-Latin transliteration, it can convert special characters and CJK characters into text equivalents.
>>> from unidecode import unidecode
>>> unidecode("Ñ Σ ® µ ¶ ¼ 月 山")
'N S (r) u P 1/4 Yue Shan'
A very cool library, I recommend it 👍
In our case we get the following conversion:
>>> text = unidecode(text)
>>> text
'Moi zagolovok No. 10 !'
A great transliteration. The emoji was simply removed, although I expected something like :). Oh well, those would be invalid characters anyway.
What's more, our code now handles other languages too, be it Hindi or Korean.
🔸3. Character filter
Unidecode does not filter out disallowed characters. We do that in the next step with a regex: every run of characters outside the allowed range is replaced with "_".
>>> import re
>>> text = re.sub(r'[^a-zA-Z0-9]+', '_', text)
>>> text
'Moi_zagolovok_No_10_'
The "+" in the pattern helps when several disallowed characters appear in a row: the whole run is replaced with a single "_".
🔸4. Slugify
All that is left is to strip the extra characters from the edges and convert to lowercase:
>>> text = text.strip('_').lower()
>>> text
'moi_zagolovok_no_10'
We get a great slug! 😎
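To see what the "+" in the filter pattern buys us, here is a small comparison on the transliterated string from the steps above. It is only an illustration; the variable names `single` and `collapsed` are made up for this sketch.

```python
import re

# The transliterated title from the walkthrough.
text = "Moi zagolovok No. 10 !"

# Without "+": every disallowed character becomes its own "_".
single = re.sub(r"[^a-zA-Z0-9]", "_", text)

# With "+": each run of disallowed characters collapses into one "_".
collapsed = re.sub(r"[^a-zA-Z0-9]+", "_", text)

print(single)     # Moi_zagolovok_No__10__
print(collapsed)  # Moi_zagolovok_No_10_
```

Without the "+", the ". " after "No" and the " !" at the end each produce a doubled separator.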
🌎 Full code as a function.
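A minimal sketch of how the steps above might be combined into one function. This is not the author's original code: the name `make_slug` and the `sep` and `transliterate` parameters are assumptions made for illustration. By default it uses `unidecode`, imported lazily so that the dependency is only needed when no custom transliterator is passed.

```python
import re
from typing import Callable, Optional


def make_slug(
    text: str,
    sep: str = "_",
    transliterate: Optional[Callable[[str], str]] = None,
) -> str:
    """Build an ASCII slug: transliterate, filter, trim, lowercase.

    `sep` is assumed to be a plain character such as "_" or "-".
    `transliterate` is a hypothetical hook; when omitted, unidecode is used.
    """
    if transliterate is None:
        # Third-party dependency (pip install Unidecode), imported on demand.
        from unidecode import unidecode
        transliterate = unidecode

    # Step 2: convert everything to Latin script.
    text = transliterate(text)
    # Step 3: replace each run of disallowed characters with one separator.
    text = re.sub(r"[^a-zA-Z0-9]+", sep, text)
    # Step 4: drop separators at the edges and lowercase the result.
    return text.strip(sep).lower()
```

With unidecode installed, `make_slug('Мой заголовок №10 😁!')` should give `'moi_zagolovok_no_10'`, the same result as the step-by-step version. Note that the function can return an empty string when nothing valid survives the filter.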
______________
P.S. I would move the check that at least one valid character is left in the string into a separate function.
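One way such a check could look, as a hedged sketch: the names `has_slug_chars` and `slug_or_default`, and the `"untitled"` fallback, are hypothetical and not from the original post.

```python
import re

# Same allowed range as in the filter step.
_ALLOWED = re.compile(r"[a-zA-Z0-9]")


def has_slug_chars(text: str) -> bool:
    """Return True if the string contains at least one allowed character."""
    return _ALLOWED.search(text) is not None


def slug_or_default(slug: str, default: str = "untitled") -> str:
    """Return the slug, or a fallback when no valid character is left."""
    return slug if has_slug_chars(slug) else default
```

Keeping the check separate lets the caller decide what an empty result means: raise an error, use a fallback, or generate an ID instead.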
#libs #tricks #django
🇷🇺🇮🇩 On May 27 in Jakarta, the Russian-Indonesian consultations on biological safety and security in an inter-agency format took place.
A review of the current situation in the field of biological safety and security at global and regional level was carried out. The issues of developing bilateral cooperation in the field of biological safety and security and strengthening of the Biological and Toxin Weapons Convention (#BTWC) regime were discussed.
The Sides exchanged information on respective national efforts in addressing the biological safety and security. The Sides noted the need for further close coordination and constructive interaction both in bilateral format and at relevant multilateral fora, primarily within the framework of the BTWC and the UN.
#RussiaIndonesia
⚡️Comment by Foreign Ministry Spokeswoman Maria Zakharova on military biological activity in Ukrainian biological laboratories.
💬 We confirm that, during the special military operation in Ukraine, the Kiev regime was found to have been concealing traces of a military biological programme implemented with funding from the United States Department of Defence.
Documentation on the urgent eradication of highly hazardous pathogens of plague, anthrax, rabbit-fever, cholera and other lethal diseases on February 24 was received from employees of Ukrainian biolaboratories. This included an instruction from the Ministry of Health of Ukraine on the urgent eradication of stored reserves of highly hazardous pathogens sent to all biolaboratories. These materials can be found on the internet portal of the Ministry of Defence of the Russian Federation.
This documentation is now being thoroughly analysed by specialists of nuclear, biological and chemical protection troops.
❗️ However, even at this point, we can conclude that components of biological weapons were being developed in Ukrainian laboratories in direct proximity to Russian territory.
The urgent eradication of highly hazardous pathogens on February 24 was ordered to prevent exposing a violation of Article I of the Biological and Toxin Weapons Convention (#BTWC) by Ukraine and the United States.
👉 This information proves that the claims we have repeatedly made with regard to the military biological activity of the United States and their allies in the post-Soviet space within the framework of the BTWC were justified. We cannot rule out using the mechanisms in Articles V and VI of the BTWC, pursuant to which the member states shall consult with each other to resolve any issues regarding the purpose of the Convention or the execution of its provisions, and cooperate in any investigation of possible violations of the obligations under the BTWC.
Decisive actions to strengthen the BTWC regime are required to prevent any military biological activity carried out in violation of the BTWC.
• We call for the resumption of work on a legally binding Protocol to the Convention with an effective verification mechanism, something the United States has been blocking since 2001. In this context, we call for the creation of an open-ended group as part of the BTWC, which is in the interests of the overwhelming majority of member states.
• To strengthen the institutional foundation of the Convention we promote initiatives, which have wide international support, to create mobile medical and biological units within the framework of the BTWC (to render assistance if biological weapons are being used and to fight outbreaks of various origins) and to establish a Research Advisory Committee (to analyse scientific and technical achievements and to provide the states with appropriate recommendations).
• In addition, we suggest including information on military biological activities carried out abroad in annual reports provided by the BTWC member states as part of confidence-building measures.
☝️Only comprehensive steps like these will make it possible to place the military biological activity of the #UnitedStates and their allies in the post-Soviet space, as well as other regions of the world, under close international control and ensure the verifiable compliance of the BTWC member states with their obligations.
🇷🇺🇰🇭 On June 10-11, Sochi hosted the 5th International Conference "Global Threats to Biological Security: Problems and Solutions", which was attended by representatives of the Ministry of Environment of the Kingdom of Cambodia.
☣️ The event, organized by Rospotrebnadzor jointly with the Ministry of Foreign Affairs of the Russian Federation, is focused on considering topical problems of biological security and seeking their solutions.
🤝 The participants of the Conference discussed international cooperation and exchanged assessments of national experience in countering biological threats. Scientific and technical developments in the context of ensuring biological security were considered. A thorough exchange of views on multilateral collaboration, readiness, response and assistance within the framework of the Biological and Toxin Weapons Convention (#BTWC) took place. The issue of strengthening the Convention was discussed in detail.
Deputy Minister of Foreign Affairs of the Russian Federation H.E. Mr Sergey Ryabkov made a report on the problems of biological security and prospects for international cooperation in this area.
#BioSecurity
#RussiaCambodia
🇷🇺🇰🇭 On June 4, in Phnom Penh the Russian-Cambodian consultations on biological security in an inter-agency format took place.
An exchange of biological security threat assessments in Asia-Pacific Region as well as in the world was carried out. The issues of developing bilateral cooperation in the field of biosecurity and strengthening of the Biological and Toxin Weapons Convention (#BTWC) regime were discussed.
The Sides discussed ways to strengthen the unity of approaches of Russia and Cambodia to the biological security. The Sides noted the need for further close coordination and constructive interaction both in bilateral format and at relevant multilateral fora, primarily within the framework of the BTWC and the UN.
#RussiaCambodia#BioSecurity
🇷🇺🇲🇲 On 29 January, 2026, Russian-Myanmar consultations on biological security in interagency format took place in Naypyidaw.
The Russian delegation was headed by Konstantin Vorontsov, Deputy Director of the Department for Nonproliferation and Arms Control of the Ministry of Foreign Affairs of the Russian Federation, and the Myanmar delegation was headed by Aung Zay Ya, Deputy Minister of Science and Technology of the Republic of the Union of #Myanmar.
The exchange of assessments concerning threats in the field of biological security in Asia-Pacific Region as well as in the world was conducted.
The issues of developing bilateral cooperation in the field of biological security and strengthening the Biological and Toxin Weapons Convention (#BTWC) regime were discussed.
The meeting confirmed the unity of approaches between the Russian Federation and Myanmar on the issues of biological security.
🤝 The need for further close coordination and constructive interaction both in bilateral format, as well as at relevant multilateral fora, primarily within the framework of the BTWC and the UN, was noted.
#RussiaMyanmar
#BioSecurity