
TGINSIGHT SIMILAR POSTS

Find similar content

Source channel @pythonotes · Post #65 · Apr 8

A small regular-expression trick that I rarely see in other people's code.

Suppose you need to parse plain text and pull out name+phone pairs, returning them as a list of dictionaries. Take a very simple sample text.

>>> text = '''
... Alex:8999123456
... Mike:+799987654
... Oleg:+344456789
... '''

To pick out the pieces we need, we will use groups. That gives this pattern:

(\w+):([\d+]+)

How do we build a dictionary from the matched groups?

>>> import re
>>> results = []
>>> for match in re.finditer(r"(\w+):([\d+]+)", text):
...     results.append({
...         "name": match.group(1),
...         "phone": match.group(2)
...     })
>>> print(results)
[{'name': 'Alex', 'phone': '8999123456'}, ...]

You can shorten this a little with zip:

>>> results = []
>>> for match in re.finditer(r"(\w+):([\d+]+)", text):
...     results.append(dict(zip(['name', 'phone'], match.groups())))

But there is a better way: named groups. You can give each group a name in the pattern and get the result back as a dictionary right away.

>>> results = []
>>> for match in re.finditer(r"(?P<name>\w+):(?P<phone>[\d+]+)", text):
...     results.append(match.groupdict())

All I did was add this prefix at the start of each group (inside the parentheses):

(?P<group_name>...)

Now the matched group has a name, and you can access it by subscripting the match, much like a dictionary key:

>>> name = match['name']

Or grab the whole dictionary at once with the groupdict() method:

>>> match.groupdict()

#tricks #regex

Hashtags

#tricks #regex
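
The named-group approach from the @pythonotes post above can be collected into one small script. This is a minimal sketch, not the author's code: the helper name `parse_contacts` and the precompiled `PATTERN` constant are assumptions added for illustration, while the pattern and sample text come from the post.

```python
import re

# Sample text from the post: one "name:phone" pair per line.
text = """
Alex:8999123456
Mike:+799987654
Oleg:+344456789
"""

# Named groups (?P<name>...) let each match report its captures by name.
# PATTERN is a hypothetical module-level constant, compiled once for reuse.
PATTERN = re.compile(r"(?P<name>\w+):(?P<phone>[\d+]+)")


def parse_contacts(raw):
    """Return one dict per name:phone pair found in raw (hypothetical helper)."""
    return [match.groupdict() for match in PATTERN.finditer(raw)]


contacts = parse_contacts(text)
print(contacts)
# [{'name': 'Alex', 'phone': '8999123456'},
#  {'name': 'Mike', 'phone': '+799987654'},
#  {'name': 'Oleg', 'phone': '+344456789'}]
```

Because groupdict() takes its keys from the pattern itself, adding another named group to the pattern adds the matching key to every result without touching the loop.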

Results

Found 10 similar posts

Search: #scrapy

djangoproject

@djangoproject · Post #453 · 02.10.2017, 20:18

https://medium.com/towards-data-science/using-scrapy-to-build-your-own-dataset-64ea2d7d4673 In short, #Scrapy is a framework built to build web scrapers more easily and relieve the pain of maintaining them. Basically, it allows you to focus on the data extraction using #CSS selectors and choosing XPath expressions and less on the intricate internals of how spiders are supposed to work.

Hashtags

#scrapy #css

有空多睡觉，没空少看书

@kankanshu · Post #16374 · 01.12.2025, 20:03

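IMOOC hands-on course: the Python distributed crawler course, a bestseller for 3 years (original extract) #Python爬虫 #分布式爬虫 #Scrapy The course is built around a real e-commerce website and takes you from scratch through building a distributed crawler system, covering core Scrapy-Redis techniques and anti-scraping countermeasures. It comes with source code and three years of ongoing updates, and has a strong practical focus. 💾 To get the resource, click: 👉 Click here to get the IMOOC hands-on Python distributed crawler course (original extract) 👈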

Hashtags

#python爬虫 #分布式爬虫 #scrapy

djangoproject

@djangoproject · Post #223 · 05.01.2017, 13:36

#scrapy Scrapy is a fast high-level #web crawling and web scraping framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from #data_mining to #monitoring and #automated_testing. https://github.com/scrapy/scrapy

Hashtags

#scrapy #web #data_mining #monitoring #automated_testing

GitHub Trends

@githubtrending · Post #15387 · 04.01.2026, 11:30

#python #crawler #feapder #feaplat #scrapy #spider Feapder is a simple, powerful Python web scraping framework (Python 3.6+) with four spider types for different needs, plus breakpoint resuming, monitoring alerts, browser rendering, and massive data deduplication. Install easily via pip (basic, render, or full versions), create a spider with one command, and run it to fetch/parse sites like Baidu. A management system handles deployment/scheduling. This saves you time by making scraping fast, reliable, and scalable without building everything from scratch. https://github.com/Boris-code/feapder

Hashtags

#python #crawler #feapder #feaplat #scrapy #spider

djangoproject

@djangoproject · Post #317 · 28.04.2017, 06:15

https://www.dunebook.com/7-best-python-libraries-of-2017/ #TensorFlow #Scrapy #Scikit_learn #OpenAI_universe #zappa #Arrow #Theano

Hashtags

#tensorflow #scrapy #scikit_learn #openai_universe #zappa #arrow #theano

Repositorio data science

@repo_science · Post #3180 · 12.05.2023, 19:53

#webScraping #Python #Scrapy

🐍 Scrapy course - Python web scraping for beginners

The Scrapy #Beginners Course will teach you everything you need to learn to start scraping websites at scale using #Python Scrapy.

Topics
- Creating your first #Scrapy spider
- #Crawling through websites & scraping data from each page
- Cleaning data with Items & Item Pipelines
- Saving data to CSV files, #MySQL & #Postgres #databases
- Using fake #user-agents & headers to avoid getting blocked
- Using #proxies to scale up your web scraping without getting banned
- Deploying your #scraper to the cloud & scheduling it to run periodically

🗣️ Joe Kearney. 🔗Link 📢#youtube

⭐️ Resources ⭐️
Course Resources
- Scrapy Docs
- Course Guide
- Course Github
- The Python Scrapy Playbook

-----
Main channel: @repo_science
Coupons: @freecoupons_reposcience
-----

Hashtags

#webscraping #python #scrapy #beginners #crawling #mysql #postgres #databases #user #proxies #scraper #youtube

djangoproject

@djangoproject · Post #224 · 07.01.2017, 16:53

#AI #automated_testing #automation #asyncio #atexit #button #concurrency #Coroutines #data_mining #dropdownbox #Debian #decorators #django_cms #form #Google #Gym #intelligence #input #lists #machine_learning #map #Metaprogramming #Micro_services #monitoring #Multipart #multi_touch_apps #multiprocessing #Nodes #numerical #OAuth #package #pytest #python #requests #Requests #satellite #scrapy #scikit_learn #SciPy #searching #submit #selectbox #sessions #TensorFlow #text_boxes #text #telegram #Threads #tuples #Universe #urllib #upload

djangoproject

@djangoproject · Post #298 · 17.04.2017, 07:42

#AI #Artificial_Intelligence #aiohttp #API #AWS #asyncio #audio #automated_testing #automation #atexit #BeeWare #button #client #concurrency #cron #Coroutine #data_analysis #data_mining #data_processing #database #Deep_Learning #Debian #decorator #dispatch #django #dropdownbox #Docker #event #Firefox #form #freeze #functool #Generator #GeoDjango #Google #GPU #Gym #learn #Image_processing #intelligence #input #IOT #lambda #lists #machine_learning #Magenta #map #Metaprogramming #Micro_services #mind #monitoring #MongoDB #Mozilla #Multipart #multi_touch_apps #multiprocessing #Nodes #NoSQL #numeric_computation #numerical #NumPy #OAuth #object_serialization #OCR #overloading #package #parallel #pipeline #protocols #PostGIS #pyAudioAnalysis #PyInstaller #PySide #PyTorch #pytest #python #Pyvideo_archives #Qt #Redis #random #request #REST #satellite #scrapy #scikit_learn #SciPy #searching #submit #selectbox #Selenium #serialization #server #session #socket #sound #task #TensorFlow #text_boxes #text #test #telegram #Thread #transport #tuples #Universe #Unix #urllib #upload #Web

Hashtags

#ai #artificial_intelligence #aiohttp #api #aws #asyncio #audio #automated_testing #automation #atexit #beeware #button #client #concurrency #cron #coroutine #data_analysis #data_mining #data_processing #database #deep_learning #debian #decorator #dispatch #django #dropdownbox #docker #event #firefox #form #freeze #functool #generator #geodjango #google #gpu #gym #learn #image_processing #intelligence #input #iot #lambda #lists #machine_learning #magenta #map #metaprogramming #micro_services #mind #monitoring #mongodb #mozilla #multipart #multi_touch_apps #multiprocessing #nodes #nosql #numeric_computation #numerical #numpy #oauth #object_serialization #ocr #overloading #package #parallel #pipeline #protocols #postgis #pyaudioanalysis #pyinstaller #pyside #pytorch #pytest #python #pyvideo_archives #qt #redis #random #request #rest #satellite #scrapy #scikit_learn #scipy #searching #submit #selectbox #selenium #serialization #server #session #socket #sound #task #tensorflow #text_boxes #text #test #telegram #thread #transport #tuples #universe #unix #urllib #upload #web

djangoproject

@djangoproject · Post #425 · 28.08.2017, 03:37

#AI #Artificial_Intelligence #aiohttp #AngularJS #API #AWS #asyncio #audio #automated_testing #automation #atexit #BeeWare #button #client #concurrency #Coroutine #cron #curl #data_analysis #data_mining #data_processing #database #Deep_Learning #Debian #decorator #dict #dispatch #django #django_cms #dropdownbox #Docker #event #Firefox #form #Generator #GeoDjango #git #Google #GPU #Gym #learn #Image_processing #intelligence #input #IOT #lambda #learn #lists #machine_learning #Magenta #map #Metaprogramming #Micro_services #mind #monitoring #MongoDB #Mozilla #Multipart #multi_touch_apps #multiprocessing #Nodes #NoSQL #numeric_computation #numerical #NumPy #OAuth #object_serialization #OCR #overloading #package #parallel #pipeline #protocols #PostGIS #pyAudioAnalysis #pycon #Pyflakes #PyInstaller #PySide #PyTorch #pytest #python #Pyvideo_archives #Qt #React #Redis #random #request #REST #satellite #scrapy #scikit_learn #SciPy #searching #submit #selectbox #Selenium #serialization #server #socket #task #telegram #TensorFlow #test #text_boxes #text #tuples #unicode #Universe #Unix #urllib #upload #Web

Hashtags

#ai #artificial_intelligence #aiohttp #angularjs #api #aws #asyncio #audio #automated_testing #automation #atexit #beeware #button #client #concurrency #coroutine #cron #curl #data_analysis #data_mining #data_processing #database #deep_learning #debian #decorator #dict #dispatch #django #django_cms #dropdownbox #docker #event #firefox #form #generator #geodjango #git #google #gpu #gym #learn #image_processing #intelligence #input #iot #lambda #lists #machine_learning #magenta #map #metaprogramming #micro_services #mind #monitoring #mongodb #mozilla #multipart #multi_touch_apps #multiprocessing #nodes #nosql #numeric_computation #numerical #numpy #oauth #object_serialization #ocr #overloading #package #parallel #pipeline #protocols #postgis #pyaudioanalysis #pycon #pyflakes #pyinstaller #pyside #pytorch #pytest #python #pyvideo_archives #qt #react #redis #random #request #rest #satellite #scrapy #scikit_learn #scipy #searching #submit #selectbox #selenium #serialization #server #socket #task #telegram #tensorflow #test #text_boxes #text #tuples #unicode #universe #unix #urllib #upload #web

djangoproject

@djangoproject · Post #513 · 30.11.2017, 22:00

#AI #Artificial_Intelligence #AJAX #aiohttp #Anaconda #AngularJS #API #Atom #AWS #asyncio (#Asynchronous) #audio #automated_testing #automation #atexit #BeeWare #Big_Data #bitcoin #blockchain #Bluemix #Brython #button #Celery #client #class #classmethod #concurrency #Coroutine #cron #CSS #curl #data_analysis #data_mining #data_processing #database #Deep_Learning #deep_learning #Debian #decorator #deploy #dict #dispatch #django #django_cms #Django_REST_Framework #dropdownbox #Docker #event #Firefox #Flask #form #functions #Generator #GeoDjango #git #Google #GPU #GUI #Gym #host #HTML #httplib #learn #Image_processing #intelligence #input #Instagram #IOT #iPython #Jupyter #lambda #learn #License #Linux #lists #machine_learning #Magenta #map #Matplotlib #Metaprogramming #Micro_services #Micropython #mind #monitoring #MongoDB #modules #Mozilla #Multipart #multi_touch_apps #multiprocessing #Nodes #NoSQL #numeric_computation #numerical #NumPy #network #neural_network #OAuth #object_serialization #OCR #overloading #package #parallel #pipeline #protocols #PostGIS #pyAudioAnalysis #pycon #Pyflakes #PyInstaller #PyPI #PyQt #PySide #PyTorch #pytest #python #Pyvideo_archives #Qt #Raspberry_Pi #React #Redis #random #request #Regular_Expressions (#re) #REST #RSS #satellite #scikit_learn #SciPy #scrapy #searching #selectbox #Selenium #serialization #server #sessions #single_responsibility_principle #socket #Spark #str #submit #task #telegram #template #TensorFlow #test #text_boxes #text #tuples #unicode #Universe #Unix #unit_test #urllib #upload #uWSGI #Web #WSGI

Hashtags

#ai #artificial_intelligence #ajax #aiohttp #anaconda #angularjs #api #atom #aws #asyncio #asynchronous #audio #automated_testing #automation #atexit #beeware #big_data #bitcoin #blockchain #bluemix #brython #button #celery #client #class #classmethod #concurrency #coroutine #cron #css #curl #data_analysis #data_mining #data_processing #database #deep_learning #debian #decorator #deploy #dict #dispatch #django #django_cms #django_rest_framework #dropdownbox #docker #event #firefox #flask #form #functions #generator #geodjango #git #google #gpu #gui #gym #host #html #httplib #learn #image_processing #intelligence #input #instagram #iot #ipython #jupyter #lambda #license #linux #lists #machine_learning #magenta #map #matplotlib #metaprogramming #micro_services #micropython #mind #monitoring #mongodb #modules #mozilla #multipart #multi_touch_apps #multiprocessing #nodes #nosql #numeric_computation #numerical #numpy #network #neural_network #oauth #object_serialization #ocr #overloading #package #parallel #pipeline #protocols #postgis #pyaudioanalysis #pycon #pyflakes #pyinstaller #pypi #pyqt #pyside #pytorch #pytest #python #pyvideo_archives #qt #raspberry_pi #react #redis #random #request #regular_expressions #re #rest #rss #satellite #scikit_learn #scipy #scrapy #searching #selectbox #selenium #serialization #server #sessions #single_responsibility_principle #socket #spark #str #submit #task #telegram #template #tensorflow #test #text_boxes #text #tuples #unicode #universe #unix #unit_test #urllib #upload #uwsgi #web #wsgi