Find similar content

Source channel @pythonotes · Post #121 · 20 Jul

Every so often you need to turn arbitrary text into something maximally compatible: a URL, a file name, an object name in some application, and so on. The compatibility requirement is simple: the text may contain only allowed characters. Usually that means a-z, 0-9 and "_" or "-", that is, only lowercase Latin letters and digits (as an example).

Say we need to turn a blog post title into a slug to put in the post's URL. What is the best way to do it?

Django ships a ready-made slugify function for exactly this. But I never use it. Why? Because it is not enough! Here is an example:

>>> from django.utils.text import slugify
>>> slugify('This is a Title')
'this-is-a-title'

So far so good.

>>> slugify('This is a "Title!"')
'this-is-a-title'

The special characters are gone, all good.

>>> slugify('Это заголовок статьи')
''

And here we hit a wall 😢. By default, non-ASCII letters are simply dropped. This can be fixed:

>>> slugify('Это заголовок статьи', allow_unicode=True)
'это-заголовок-статьи'

But then we no longer meet the requirement: there is now Cyrillic in the text. Since I often build sites for Russian-speaking users, this problem matters a lot to me. So I ignore the built-in function and always write my own from scratch.

Step by step:

🔸1. Source text

>>> text = 'Мой заголовок №10 😁!'

I deliberately picked a harder one, with special characters.

🔸2. Transliteration

Every character has to be transliterated into Latin. The unidecode library is a great help here. Besides plain Cyrillic-to-Latin transliteration, it can convert special characters and CJK characters into text equivalents.

>>> from unidecode import unidecode
>>> unidecode("Ñ Σ ® µ ¶ ¼ 月山")
'N S (r) u P 1/4 Yue Shan'

A very cool library, recommended 👍

In our case we get this conversion:

>>> text = unidecode(text)
>>> print(text)
Moi zagolovok No. 10 !

A fine transliteration. The emoji was simply removed, although I expected something like :). Oh well, those would be invalid characters anyway. And our code already supports any language, be it Hindi or Korean.

🔸3. Character filter

Unidecode does not filter out disallowed characters. We do that in the next step with a regex: every character outside the allowed range is replaced with "_".

>>> import re
>>> text = re.sub(r'[^a-zA-Z0-9]+', '_', text)
>>> print(text)
Moi_zagolovok_No_10_

The "+" in the pattern helps when several disallowed characters appear in a row: the whole run is replaced with a single "_".

🔸4. Slugify

All that is left is to strip the extra characters from both ends and convert to lowercase.

>>> text = text.strip('_').lower()
>>> print(text)
moi_zagolovok_no_10

We get a great slug! 😎

🌎 The full code as a function.
______________
P.S. I would move the check that at least one allowed character remains in the string into a separate function.

#libs #tricks #django
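The steps in the post can be combined into one function. What follows is only a minimal sketch, not the author's own full code: the names make_slug, is_valid_slug and the separator parameter are hypothetical, and the unicodedata fallback is an assumption added so the sketch still runs when Unidecode is not installed (the fallback only strips accents and cannot transliterate Cyrillic).

```python
import re

try:
    # Third-party library used in the post (pip install Unidecode).
    from unidecode import unidecode as _to_ascii
except ImportError:
    import unicodedata

    def _to_ascii(text: str) -> str:
        """Fallback, an assumption of this sketch and not from the post.

        It only strips accents; Cyrillic and other scripts are dropped.
        """
        decomposed = unicodedata.normalize("NFKD", text)
        return decomposed.encode("ascii", "ignore").decode("ascii")


def make_slug(text: str, separator: str = "_") -> str:
    """Transliterate, filter, trim and lowercase, following the post's steps.

    `separator` is expected to be a single character such as "_" or "-".
    """
    text = _to_ascii(text)
    # Collapse every run of disallowed characters into one separator.
    text = re.sub(r"[^a-zA-Z0-9]+", separator, text)
    # Trim separators left at the edges, then lowercase.
    return text.strip(separator).lower()


def is_valid_slug(slug: str) -> bool:
    """Separate check from the P.S.: at least one allowed character remains."""
    return re.search(r"[a-z0-9]", slug) is not None
```

With Unidecode installed, make_slug('Мой заголовок №10 😁!') should reproduce the post's result, moi_zagolovok_no_10. Keeping is_valid_slug separate lets the caller decide what to do with an empty slug, for example fall back to an ID.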

Hashtags

#libs #tricks #django

Results

Found 3 similar posts

Search: #eucommission


AI & Law

@ai_and_law · Post #17 · 06.06.2023, 07:04


Transparency in AI-generated content: EU Commissioner calls for labeling

EU Commissioner Vera Jourova has highlighted the importance of clearly identifying content that is generated or significantly influenced by AI systems. The proposal to label AI-generated content serves multiple purposes: protecting consumer rights, promoting accountability, and enabling individuals to distinguish between human-created and AI-generated information.

The European Commission wants tech companies like Google, Facebook and TikTok to start labeling content created by artificial intelligence without waiting for digital laws to come into effect.

As AI becomes more prevalent in content creation, legal concerns arise regarding authenticity, accountability, and the potential for misinformation. By introducing labeling requirements, the EU aims to provide legal clarity, allowing consumers and authorities to better navigate the digital landscape while holding AI systems accountable for the information they generate.

While the EU takes a proactive stance in regulating AI-generated content, the implications extend beyond its borders. As AI transcends geographical boundaries, the need for transparent labeling practices becomes crucial on a global scale. International collaboration in developing standardized guidelines can enhance consistency and protect users' rights across jurisdictions.

#artificialintelligence #AI #Law #EUCommission

Hashtags

#artificialintelligence #ai #law #eucommission

AI & Law

@ai_and_law · Post #348 · 09.07.2024, 07:04


European Commission's AI Codes of Practice: A Self-Regulation Concern?

According to Euractiv, the European Commission plans to let AI model providers draft codes of practice for compliance with the AI Act, with civil society organizations consulted during the process. This approach has sparked concerns about industry self-regulation, as these codes will serve as compliance measures for general-purpose AI models until harmonized standards are set.

The Commission may grant EU-wide validity to these codes through an implementing act. Some civil society members worry this could enable Big Tech to essentially write their own rules.

The AI Act's language on stakeholder participation in drafting these codes is ambiguous. The Commission has stated that an upcoming call for expressions of interest will clarify how various stakeholders, including civil society, will be involved. However, specifics are still lacking.

An external firm will be hired to manage the drafting process, including stakeholder engagement and weekly working group meetings. The AI Office will oversee the process but will primarily focus on approving the final codes.

#AIRegulation #EUCommission #AICodes #AIAct #Compliance

Hashtags

#airegulation #eucommission #aicodes #aiact #compliance

AI & Law

@ai_and_law · Post #55 · 13.07.2023, 11:20


Spain takes the lead in shaping EU's AI regulations

Spain has assumed the rotating presidency of the EU Council of Ministers and is gearing up to make a significant impact on the future of artificial intelligence regulations in the European Union. As part of its digital priorities, Spain aims to reach a political agreement on the AI Act.

In preparation for upcoming negotiations with the EU Council, Parliament, and Commission, Spain has shared its position on key aspects of the Act:

1️⃣ Defining AI: Spain is considering different options, including sticking with the Council's text, aligning with the Parliament's position, or awaiting the OECD's (Organisation for Economic Co-operation and Development) guidance.
2️⃣ Classification of High-Risk Applications: Spain is exploring various possibilities, such as adopting the Parliament's version without the notification of competent authorities, or refining it with binding self-assessment criteria for AI providers.
3️⃣ Addressing Critical Concepts: Spain is examining whether the AI Act is the appropriate framework to address concepts like democracy, the rule of law, and sustainability.
4️⃣ Clarity in Terminology: Spain is assessing the potential introduction of the term 'deployer' to minimize confusion and ensure clear roles and responsibilities within the AI ecosystem.

These discussions will inform the trilogue negotiations scheduled for 18 July, where representatives from the Council, Parliament, and Commission will work towards a consensus on the AI Act.

#SpainPresidency #AIAgenda #EURegulations #AIAct #AIRegulations #EUCouncil #EUCommission #EUParliament #Trilogue

Hashtags

#spainpresidency #aiagenda #euregulations #aiact #airegulations #eucouncil #eucommission #euparliament #trilogue