Source channel @pythonotes · Post #121 · 20 Jul.

Every so often you need to convert some text into a maximally compatible form for a URL, a file name, an object name in some piece of software, and so on. The compatibility requirement is simple: the text may contain only allowed characters. Usually that means a-z, 0-9 and "_" or "-". In other words, only lowercase Latin letters, digits and a separator (as an example).

Say we need to turn the title of a blog article into a slug to put into that article's URL. What is the best way to do it?

Django ships with a ready-made slugify function for such cases. But I never use it. Why? Because it is not enough! Here is an example:

>>> from django.utils.text import slugify
>>> slugify('This is a Title')
'this-is-a-title'

So far so good.

>>> slugify('This is a "Title!"')
'this-is-a-title'

The special characters were removed, all fine.

>>> slugify('Это заголовок статьи')
''

And here we are stuck 😢. If the text is not written in a Latin-based alphabet, the letters are simply dropped. This can be fixed:

>>> slugify('Это заголовок статьи', allow_unicode=True)
'это-заголовок-статьи'

But then we no longer meet the requirement: Cyrillic has appeared in the text. Since I often build sites for Russian-speaking users, this problem is very real for me. I don't use the standard function and always write my own, from scratch, without taking the original as a base.

So, step by step:

🔸1. Source text

>>> text = 'Мой заголовок №10 😁!'

I deliberately picked a harder one, with special characters.

🔸2. Transliteration

All characters have to be transliterated into Latin. The unidecode library is a great help here. Beyond plain Cyrillic-to-Latin transliteration, it can convert special characters and Chinese characters into text equivalents.

>>> from unidecode import unidecode
>>> unidecode("Ñ Σ ® µ ¶ ¼ 月山")
'N S (r) u P 1/4 Yue Shan'

A very cool library, I recommend it 👍 In our case we get this conversion:

>>> text = unidecode(text)
>>> print(text)
Moi zagolovok No. 10 !

A fine transliteration. The emoji was simply removed, although I expected something like :). Oh well, those are invalid characters anyway. And our code already supports any language, be it Hindi or Korean.

🔸3. Character filter

Unidecode does not filter out disallowed characters, so we do that ourselves with a regex: every run of characters outside the allowed range is replaced with "_".

>>> import re
>>> text = re.sub(r'[^a-zA-Z0-9]+', '_', text)
>>> print(text)
Moi_zagolovok_No_10_

The "+" in the pattern helps when several disallowed characters sit next to each other: the whole run is replaced with a single "_".

🔸4. Slugify

What remains is to strip the extra characters at the edges and lowercase the result.

>>> text = text.strip('_').lower()
>>> print(text)
moi_zagolovok_no_10

We get a great slug! 😎

🌎 Full code as a function.
______________
PS. I would move the check that at least one valid character remains in the string into a separate function.
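The steps above could be combined roughly as follows. This is only a sketch, not the author's linked function: the names make_slug and has_valid_chars are hypothetical, and the stdlib fallback for unidecode is an assumption added here so the example still runs where the library is not installed (the fallback only strips accents and drops non-Latin scripts instead of transliterating them).

```python
import re
import unicodedata

try:
    # Third-party library used in the post (pip install Unidecode).
    from unidecode import unidecode
except ImportError:
    # Assumption: crude stdlib-only stand-in so the sketch still runs.
    # It strips accents but drops Cyrillic and other non-Latin scripts.
    def unidecode(text):
        decomposed = unicodedata.normalize('NFKD', text)
        return decomposed.encode('ascii', 'ignore').decode('ascii')


def has_valid_chars(text):
    """Return True if at least one ASCII letter or digit survives transliteration."""
    return re.search(r'[a-zA-Z0-9]', unidecode(text)) is not None


def make_slug(text):
    """Transliterate, filter and normalise text into a lowercase slug."""
    text = unidecode(text)                      # step 2: transliterate to ASCII
    text = re.sub(r'[^a-zA-Z0-9]+', '_', text)  # step 3: collapse invalid runs into "_"
    return text.strip('_').lower()              # step 4: trim the edges and lowercase


if __name__ == '__main__':
    print(make_slug('This is a "Title!"'))
    print(has_valid_chars('!!! ???'))
```

With unidecode installed, make_slug('Мой заголовок №10 😁!') should reproduce the moi_zagolovok_no_10 result from the walkthrough. Keeping has_valid_chars separate lets the caller decide what to do with input that would produce an empty slug, for example falling back to an ID.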

Hashtags

#libs #tricks #django

Results

Found 5 similar posts

Search: #aisecurity

AI & Law

@ai_and_law · Post #651 · 05.09.2025, 07:04

📖 LegalPwn: Exploiting AI Guardrails Through Legalese

Researchers at security firm Pangea have revealed a new vulnerability in large language models (LLMs) called "LegalPwn". By embedding adversarial instructions in legal documents, attackers can bypass model safeguards and manipulate outputs. During testing, models initially flagged malicious code as dangerous but, after exposure to "legal" text containing hidden instructions, began classifying the same code as harmless, even recommending execution in some cases.

Live tests showed "LegalPwn" could bypass AI-driven security tools like Google's gemini-cli, causing models to misclassify malicious scripts and, in one instance, suggest a reverse shell be run on the user's system. While Anthropic's Claude, Microsoft's Phi, and Meta's Llama Guard resisted the attack, OpenAI's GPT-4o, Google's Gemini 2.5, and xAI's Grok were less successful.

Pangea recommends countermeasures like adversarial training, enhanced input validation, and human-in-the-loop oversight to mitigate such risks.

Hashtags

#aisecurity #aiethics

AI & Law

@ai_and_law · Post #648 · 02.09.2025, 07:04

📖 AI Adoption and the Unseen Cost of Security Breaches

A new Infosys survey reveals that 95% of executives worldwide have already faced security incidents linked to enterprise AI tools, with 77% of those incidents causing direct financial losses. These numbers highlight that security is not a theoretical risk but a measurable and recurring reality in the enterprise AI ecosystem. While many companies are moving forward with responsible AI initiatives, executives also voice growing concern about reputational damage tied to external use of these systems.

Hashtags

#aisecurity #responsibleai #aigovernance

AI & Law

@ai_and_law · Post #821 · 06.05.2026, 07:04

🇺🇸 U.S. Targets Adversarial Distillation of AI Models

The United States has issued a memo addressing risks of adversarial distillation of its AI models by foreign actors, with particular concern regarding activities linked to China. The document outlines federal measures aimed at countering unauthorized, industrial-scale extraction of model capabilities.

Planned actions include sharing intelligence with U.S. AI companies on foreign distillation attempts, improving coordination within the private sector, and developing joint best practices to detect, mitigate, and respond to such activities. The government also plans to explore mechanisms to hold foreign actors accountable for large-scale distillation campaigns.

The memo signals increased federal involvement in protecting AI systems from external exploitation and frames adversarial distillation as a growing issue in international AI competition.

Hashtags

#airegulation #aisecurity #geopolitics #aigovernance #techpolicy

AI & Law

@ai_and_law · Post #638 · 19.08.2025, 07:04

🇫🇷🇩🇪 Franco-German Guidance on Zero-Trust LLM Security

France's Agence nationale de la sécurité des systèmes d'information (ANSSI) and Germany's Federal Office for Information Security (BSI) have jointly issued a paper on applying zero-trust principles to large language models. The document identifies common design vulnerabilities and operational risks in LLM deployment, stressing the need for a security architecture that assumes no implicit trust.

The recommendations focus on three key safeguards:
✔️ restricting system access rights to the minimum necessary,
✔️ increasing transparency in algorithmic decision-making, and
✔️ ensuring continuous human oversight.

This coordinated stance from two of Europe's leading cybersecurity authorities signals a growing emphasis on proactive governance of AI systems at the infrastructure level.

Hashtags

#aisecurity #llm #zerotrust #cyberregulation

AI & Law

@ai_and_law · Post #212 · 12.01.2024, 08:04

NIST Issues Urgent Report on Escalating Threat of AI Attacks

Hello, dear subscribers!

The National Institute of Standards and Technology (NIST) has released a critical report titled "Adversarial Machine Learning: A Taxonomy and Terminology of Attacks and Mitigations," sounding the alarm on the intensifying threat landscape targeting artificial intelligence systems.

In the face of increasingly powerful yet vulnerable AI systems, the report outlines the technique of adversarial machine learning, wherein attackers manipulate AI systems through subtle tactics with potentially catastrophic consequences. The document categorizes these attacks based on attackers' goals, capabilities, and knowledge of the target AI system. Concerns include "data poisoning" and "backdoor attacks," exploiting vulnerabilities in AI system development and deployment.

Hashtags

#nist #aiattacks #aisecurity #threatlandscape #machinelearning