A small regex trick that I rarely see in other people's code.
Suppose you need to parse plain text and extract name+phone pairs from it, returning the result as a list of dictionaries. Let's take a very simple sample text.
>>> text = '''
... Alex:8999123456
... Mike:+799987654
... Oleg:+344456789
... '''
To pick out the parts we need, we'll use capturing groups. The pattern looks like this:
(\w+):([\d+]+)
How do we build a dictionary from the matched groups?
>>> import re
>>> results = []
>>> for match in re.finditer(r"(\w+):([\d+]+)", text):
...     results.append({
...         "name": match.group(1),
...         "phone": match.group(2)
...     })
...
>>> print(results)
[{'name': 'Alex', 'phone': '8999123456'}, ...]
This can be shortened a little with zip:
>>> results = []
>>> for match in re.finditer(r"(\w+):([\d+]+)", text):
...     results.append(dict(zip(['name', 'phone'], match.groups())))
...
But there is a better way: named groups. You can give each group a name right in the pattern and get the result back as a dictionary.
>>> results = []
>>> for match in re.finditer(r"(?P<name>\w+):(?P<phone>[\d+]+)", text):
...     results.append(match.groupdict())
...
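For reference, here is the same idea as a standalone script. This is only a sketch: it reuses the sample text from above, and the names CONTACT_RE and contacts are made up for the example. The pattern is compiled once and the loop is folded into a list comprehension.

```python
import re

# Sample text from the post.
text = """
Alex:8999123456
Mike:+799987654
Oleg:+344456789
"""

# CONTACT_RE is a name chosen for this sketch; the pattern is the one from the post.
CONTACT_RE = re.compile(r"(?P<name>\w+):(?P<phone>[\d+]+)")

# Each match contributes one {'name': ..., 'phone': ...} dictionary.
contacts = [m.groupdict() for m in CONTACT_RE.finditer(text)]
print(contacts)
```

The dictionary keys come straight from the group names, so renaming a group in the pattern renames the key in the result as well.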
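So all I did was add the following at the start of each group, right inside the parentheses (the name must be a valid Python identifier):
(?P<group_name>...)
Now the matched group has a name, and you can access it by that name the way you would a dictionary key (Match objects support this since Python 3.6):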
>>> name = match['name']
Or grab the whole dictionary at once with the groupdict() method:
>>> match.groupdict()
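A short sketch of the ways to read a named group, using a single line of the sample text. The second pattern, with an optional phone part, is an assumption added only to show the default argument of groupdict(); it is not part of the original example.

```python
import re

match = re.search(r"(?P<name>\w+):(?P<phone>[\d+]+)", "Alex:8999123456")

by_key = match["name"]           # subscript access, Python 3.6+
by_method = match.group("name")  # the same lookup through group()
by_index = match.group(1)        # named groups are still numbered
as_dict = match.groupdict()      # all named groups at once

# Hypothetical variant: the phone part is optional, so the group may not match.
optional = re.search(r"(?P<name>\w+)(?::(?P<phone>[\d+]+))?", "Alex")
with_default = optional.groupdict(default="")  # unmatched groups get "" instead of None

print(by_key, by_method, by_index)
print(as_dict)
print(with_default)
```

Numbered access keeps working, so adding names to an existing pattern should not break code that already relies on group(1) and group(2).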
#tricks #regex
While Zionist warplanes rained terror over #Baalbek Lebanon, our children stood tall, not crying, not running, rather shouting from their classrooms: "We shall never be humiliated."
This is the spirit they cannot bomb.
This is the generation they cannot break.
From the heart of Baalbek, under the sound of explosions, rise voices louder than fear, echoing the promise that we will not kneel.
To every child who stood firm today: you are the heartbeat of Lebanon, the voice of dignity, the living proof that resistance is not taught in books; it’s born in the blood of the free.
هيهات منّا الذّلة✊🏻
Lebanon's suffering Roman ruins
Lebanon’s breathtaking Baalbek ruins, once a must-see attraction for thousands of visitors from around the world, lie almost entirely empty as the combined impact of the global health crisis and the country’s deep economic crisis hits the tourism sector.
#tourism #Lebanon #News #Reuters #Baalbek
The Israeli warplanes launched more than 70 airstrikes against Lebanon’s densely-populated Beqaa, Baalbek and Hermel, massacring at least 63 people only on Monday. The Israeli strikes left many more injured and massive destruction across the area.
#Beqaa#Baalbek#Hermel#LebanonUnderAttack#Israel
Watch this to learn how #Zionazi-#propagandists at #Wikipedia are trying to manufacture consent for #Israeli destruction of the ancient city of #Baalbek in #Lebanon
To the #ZioNazi#genocidiares in #Israel, nothing is sacred.
#Baalbek is an ancient #Phoenician city and is home to some of the world’s best-preserved #Roman ruins, such as the Temple of Bacchus.
Baalbek was designated a #UNESCO World Heritage Site in 1984 for its ancient Roman temple complex.
UNESCO describes these temples as “one of the finest examples of Imperial Roman architecture at its apogee” and they draw tourists from around the world.
Tourism supports much of the local population, so naturally the ZioNazis have targetted some of its ancient buildings
The older and more entrenched a building, a street, a monument, the more it is a target.
Erasure and eradication. Displace and replace is the goal.
The world went crazy for the Buddha statues in Bamiyan but remain mute for Baalbek.