Algorithms and Design Patterns in Python
Python implementation of algorithms and design patterns.
- Short Examples of Data Structures and Algorithms
- PyPattyrn – Simple library for implementing common design patterns.
- python-patterns – A collection of design patterns in Python.
- sortedcontainers – Fast, pure-Python implementation of SortedList, SortedDict, and SortedSet types.
Compatibility
Libraries for migrating from Python 2 to 3.
- Python-Future – The missing compatibility layer between Python 2 and Python 3.
- Python-Modernize – Modernizes Python code for eventual Python 3 migration.
- Six – Python 2 and 3 compatibility utilities.
Cluster Computing
Frameworks and libraries for Cluster Computing.
- PySpark – Apache Spark Python API.
- dask – A flexible parallel computing library for analytic computing.
- faust – A stream processing library, porting the ideas from Kafka Streams to Python.
- luigi – A module that helps you build complex pipelines of batch jobs.
- mrjob – Run MapReduce jobs on Hadoop or Amazon Web Services.
- streamparse – Run Python code against real-time streams of data via Apache Storm.
Computer Vision
Libraries for computer vision.
- OpenCV – Open Source Computer Vision Library.
- pyocr – A wrapper for Tesseract and Cuneiform.
- pytesseract – Another wrapper for Google Tesseract OCR.
- SimpleCV – An open source framework for building computer vision applications.
Concurrency and Parallelism
Libraries for concurrent and parallel execution.
- concurrent.futures – (Python standard library) A high-level interface for asynchronously executing callables.
- multiprocessing – (Python standard library) Process-based "threading" interface.
- eventlet – Asynchronous framework with WSGI support.
- gevent – A coroutine-based Python networking library that uses greenlet.
- SCOOP – Scalable Concurrent Operations in Python.
- Tomorrow – Magic decorator syntax for asynchronous code.
- uvloop – Ultra fast implementation of asyncio event loop on top of libuv.
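A minimal sketch of the concurrent.futures entry above, using a thread pool to run a callable over several inputs. The names square and run_in_pool are hypothetical, chosen only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def square(n):
    """Hypothetical stand-in for a real unit of work."""
    return n * n


def run_in_pool(values, max_workers=4):
    # The executor schedules each call on a worker thread;
    # executor.map yields results in the same order as the inputs.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(square, values))


if __name__ == "__main__":
    print(run_in_pool([1, 2, 3, 4]))
```

For CPU-bound work, ProcessPoolExecutor offers the same interface but runs the calls in separate processes.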
Deep Learning
Frameworks for Neural Networks and Deep Learning. See: awesome-deep-learning.
- TensorFlow – The most popular Deep Learning framework created by Google.
- Caffe – A fast open framework for deep learning.
- Keras – A high-level neural networks library capable of running on top of either TensorFlow or Theano.
- MXNet – A deep learning framework designed for both efficiency and flexibility.
- Neupy – Running and testing different Artificial Neural Networks algorithms.
- PyTorch – Tensors and Dynamic neural networks in Python with strong GPU acceleration.
- Serpent.AI – Game agent framework. Use any video game as a deep learning sandbox.
- Theano – A library for fast numerical computation.
Machine Learning
Libraries for Machine Learning. See: awesome-machine-learning.
- H2O – Open Source Fast Scalable Machine Learning Platform.
- Metrics – Machine learning evaluation metrics.
- NuPIC – Numenta Platform for Intelligent Computing.
- scikit-learn – The most popular Python library for Machine Learning.
- Spark ML – Apache Spark's scalable Machine Learning library.
- vowpal_porpoise – A lightweight Python wrapper for Vowpal Wabbit.
- xgboost – A scalable, portable, and distributed gradient boosting library.
Data Analysis
Libraries for data analysis.
- Pandas – A library providing high-performance, easy-to-use data structures and data analysis tools.
- Blaze – NumPy and Pandas interface to Big Data.
- Open Mining – Business Intelligence (BI) in Pandas interface.
- Orange – Data mining, data visualization, analysis and machine learning through visual programming or scripts.
- Optimus – Cleansing, pre-processing, feature engineering, exploratory data analysis and easy Machine Learning with a PySpark backend.
Data Visualization
Libraries for visualizing data. See: awesome-javascript.
- Altair – Declarative statistical visualization library for Python.
- Bokeh – Interactive Web Plotting for Python.
- bqplot – Interactive Plotting Library for the Jupyter Notebook.
- ggplot – Same API as ggplot2 for R.
- Matplotlib – A Python 2D plotting library.
- Pygal – A Python SVG Charts Creator.
- PyGraphviz – Python interface to Graphviz.
- PyQtGraph – Interactive and realtime 2D/3D/Image plotting and science/engineering widgets.
- Seaborn – Statistical data visualization using Matplotlib.
- VisPy – High-performance scientific visualization based on OpenGL.
Database
Databases implemented in Python.
- pickleDB – A simple and lightweight key-value store for Python.
- TinyDB – A tiny, document-oriented database.
- ZODB – A native object database for Python. A key-value and object graph database.
Database Drivers
Libraries for connecting and operating databases.
- MySQL – awesome-mysql
- mysqlclient – MySQL connector with Python 3 support (mysql-python fork).
- oursql – A better MySQL connector with support for native prepared statements and BLOBs.
- PyMySQL – A pure Python MySQL driver compatible with mysql-python.
- PostgreSQL – awesome-postgres
- psycopg2 – The most popular PostgreSQL adapter for Python.
- queries – A wrapper of the psycopg2 library for interacting with PostgreSQL.
- txpostgres – Twisted based asynchronous driver for PostgreSQL.
- Other Relational Databases
- NoSQL Databases
- cassandra-driver – The Python Driver for Apache Cassandra.
- HappyBase – A developer-friendly library for Apache HBase.
- kafka-python – The Python client for Apache Kafka.
- py2neo – Python wrapper client for Neo4j’s RESTful interface.
- PyMongo – The official Python client for MongoDB.
- redis-py – The Python client for Redis.
- Asynchronous Clients
Documentation
Libraries for generating project documentation.
- Sphinx – Python Documentation generator.
- MkDocs – Markdown friendly documentation generator.
- pdoc – Epydoc replacement to auto generate API documentation for Python libraries.
- Pycco – The literate-programming-style documentation generator.
Environment Management
Libraries for Python version and environment management.
- Pipenv – Sacred Marriage of Pipfile, Pip, & Virtualenv.
- p – Dead simple interactive Python version management.
- pyenv – Simple Python version management.
- venv – (Python standard library in Python 3.3+) Creating lightweight virtual environments.
- virtualenv – A tool to create isolated Python environments.
- virtualenvwrapper – A set of extensions to virtualenv.
HTML Manipulation
Libraries for working with HTML and XML.
- BeautifulSoup – Providing Pythonic idioms for iterating, searching, and modifying HTML or XML.
- bleach – A whitelist-based HTML sanitization and text linkification library.
- cssutils – A CSS library for Python.
- html5lib – A standards-compliant library for parsing and serializing HTML documents and fragments.
- lxml – A very fast, easy-to-use and versatile library for handling HTML and XML.
- MarkupSafe – Implements a XML/HTML/XHTML Markup safe string for Python.
- pyquery – A jQuery-like library for parsing HTML.
- untangle – Converts XML documents to Python objects for easy access.
- WeasyPrint – A visual rendering engine for HTML and CSS that can export to PDF.
- xmldataset – Simple XML Parsing.
- xmltodict – Makes working with XML feel like you are working with JSON.
HTTP
Libraries for working with HTTP.
- grequests – requests + gevent for asynchronous HTTP requests.
- httplib2 – Comprehensive HTTP client library.
- requests – HTTP Requests for Humans.
- treq – Python requests-like API built on top of Twisted’s HTTP client.
- urllib3 – An HTTP library with thread-safe connection pooling and file post support; sanity friendly.
Natural Language Processing
Libraries for working with human languages.
- gensim – Topic Modelling for Humans.
- Jieba – Chinese text segmentation.
- langid.py – Stand-alone language identification system.
- NLTK – A leading platform for building Python programs to work with human language data.
- Pattern – A web mining module for Python.
- polyglot – Natural language pipeline supporting hundreds of languages.
- SnowNLP – A library for processing Chinese text.
- spaCy – A library for industrial-strength natural language processing in Python and Cython.
- TextBlob – Providing a consistent API for diving into common NLP tasks.
- PyTorch-NLP – A toolkit enabling rapid deep learning NLP prototyping for research.
Networking
Libraries for network programming.
- asyncio – (Python standard library) Asynchronous I/O, event loop, coroutines and tasks.
- diesel – Greenlet-based event I/O Framework for Python.
- pulsar – Event-driven concurrent framework for Python.
- pyzmq – A Python wrapper for the ZeroMQ message library.
- Twisted – An event-driven networking engine.
- txZMQ – Twisted based wrapper for the ZeroMQ message library.
- NAPALM – Cross-vendor API to manipulate network devices.
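As a small illustration of the asyncio entry above (event loop, coroutines and tasks), the sketch below runs two coroutines concurrently. The names fetch_label and gather_labels are hypothetical, and asyncio.sleep stands in for real network I/O.

```python
import asyncio


async def fetch_label(label, delay):
    # asyncio.sleep yields control to the event loop, as real I/O would.
    await asyncio.sleep(delay)
    return label


async def gather_labels():
    # Both coroutines run concurrently; gather returns results in the
    # order the awaitables were passed, not the order they finish.
    return await asyncio.gather(
        fetch_label("a", 0.02),
        fetch_label("b", 0.01),
    )


if __name__ == "__main__":
    print(asyncio.run(gather_labels()))
```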
Package Management
Libraries for package and dependency management.
- pip – The Python package and dependency manager.
- conda – Cross-platform, Python-agnostic binary package manager.
- Curdling – Curdling is a command line tool for managing Python packages.
- pip-tools – A set of tools to keep your pinned Python dependencies fresh.
- wheel – The new standard of Python distribution, intended to replace eggs.
Queue
Libraries for working with event and task queues.
- celery – An asynchronous task queue/job queue based on distributed message passing.
- huey – Little multi-threaded task queue.
- mrq – Mr. Queue – A distributed worker task queue in Python using Redis & gevent.
- rq – Simple job queues for Python.
- simpleq – A simple, infinitely scalable, Amazon SQS based queue.
Recommender Systems
Libraries for building recommender systems.
- annoy – Approximate Nearest Neighbors in C++/Python optimized for memory usage.
- fastFM – A library for Factorization Machines.
- implicit – A fast Python implementation of collaborative filtering for implicit datasets.
- libffm – A library for Field-aware Factorization Machine (FFM).
- LightFM – A Python implementation of a number of popular recommendation algorithms.
- Spotlight – Deep recommender models using PyTorch.
- surprise – A scikit for building and analyzing recommender systems.
- TensorRec – A Recommendation Engine Framework in TensorFlow.
RESTful API
Libraries for developing RESTful APIs.
- Django
- django-rest-framework – A powerful and flexible toolkit to build web APIs.
- django-tastypie – Creating delicious APIs for Django apps.
- Flask
- eve – REST API framework powered by Flask, MongoDB and good intentions.
- flask-api-utils – Taking care of API representation and authentication for Flask.
- flask-api – Browsable Web APIs for Flask.
- flask-restful – Quickly building REST APIs for Flask.
- flask-restless – Generating RESTful APIs for database models defined with SQLAlchemy.
- Pyramid
- cornice – A RESTful framework for Pyramid.
- Framework agnostic
- falcon – A high-performance framework for building cloud APIs and web app backends.
- hug – A Python 3 framework for cleanly exposing APIs over HTTP and the Command Line with automatic documentation and validation.
- restless – Framework agnostic REST framework based on lessons learned from Tastypie.
- ripozo – Quickly creating REST/HATEOAS/Hypermedia APIs.
- sandman – Automated REST APIs for existing database-driven systems.
- apistar – A smart Web API framework, designed for Python 3.
Science
Libraries for scientific computing.
- astropy – A community Python library for Astronomy.
- bcbio-nextgen – Providing best-practice pipelines for fully automated high throughput sequencing analysis.
- bccb – Collection of useful code related to biological analysis.
- Biopython – Biopython is a set of freely available tools for biological computation.
- cclib – A library for parsing and interpreting the results of computational chemistry packages.
- Colour – A colour science package implementing a comprehensive number of colour theory transformations and algorithms.
- NetworkX – High-productivity software for complex networks.
- NIPY – A collection of neuroimaging toolkits.
- NumPy – A fundamental package for scientific computing with Python.
- Open Babel – A chemical toolbox designed to speak the many languages of chemical data.
- ObsPy – A Python toolbox for seismology.
- PyDy – Short for Python Dynamics, used to assist with workflow in the modeling of dynamic motion.
- PyMC – Markov Chain Monte Carlo sampling toolkit.
- QuTiP – Quantum Toolbox in Python.
- RDKit – Cheminformatics and Machine Learning Software.
- SciPy – A Python-based ecosystem of open-source software for mathematics, science, and engineering.
- statsmodels – Statistical modeling and econometrics in Python.
- SymPy – A Python library for symbolic mathematics.
- Zipline – A Pythonic algorithmic trading library.
- SimPy – A process-based discrete-event simulation framework.
Search
Libraries and software for indexing and performing search queries on data.
- django-haystack – Modular search for Django.
- elasticsearch-dsl-py – The official high-level Python client for Elasticsearch.
- elasticsearch-py – The official low-level Python client for Elasticsearch.
- esengine – ElasticSearch ODM (Object Document Mapper) for Python.
- pysolr – A lightweight Python wrapper for Apache Solr (incl. SolrCloud awareness).
- solrpy – A Python client for Solr.
- Whoosh – A fast, pure Python search engine library.
Serialization
Libraries for serializing complex data types.
- marshmallow – marshmallow is an ORM/ODM/framework-agnostic library for converting complex datatypes, such as objects, to and from native Python datatypes.
Serverless Frameworks
Frameworks for developing serverless Python code.
- apex – Build, deploy, and manage AWS Lambda functions with ease.
- python-lambda – A toolkit for developing and deploying Python code in AWS Lambda.
- Zappa – A tool for deploying WSGI applications on AWS Lambda and API Gateway.
Document Manipulation
Libraries for parsing and manipulating specific text formats.
- General
- tablib – A module for Tabular Datasets in XLS, CSV, JSON, YAML.
- Office
- Marmir – Takes Python data structures and turns them into spreadsheets.
- openpyxl – A library for reading and writing Excel 2010 xlsx/xlsm/xltx/xltm files.
- pyexcel – Providing one API for reading, manipulating and writing csv, ods, xls, xlsx and xlsm files.
- python-docx – Reads, queries and modifies Microsoft Word 2007/2008 docx files.
- python-pptx – Python library for creating and updating PowerPoint (.pptx) files.
- relatorio – Templating OpenDocument files.
- unoconv – Convert between any document format supported by LibreOffice/OpenOffice.
- XlsxWriter – A Python module for creating Excel .xlsx files.
- xlwings – A BSD-licensed library that makes it easy to call Python from Excel and vice versa.
- xlwt / xlrd – Writing and reading data and formatting information from Excel files.
- Markdown
- Mistune – The fastest full-featured pure Python Markdown parser.
- Python-Markdown – A Python implementation of John Gruber’s Markdown.
- YAML
- PyYAML – YAML implementations for Python.
- CSV
- csvkit – Utilities for converting to and working with CSV.
- Archive
- unp – A command line tool that can unpack archives easily.
Testing
Libraries for testing codebases and generating test data.
- Testing Frameworks
- hypothesis – Hypothesis is an advanced Quickcheck style property based testing library.
- mamba – The definitive testing tool for Python. Born under the banner of BDD.
- nose – A nicer unittest for Python.
- nose2 – The successor to nose, based on unittest2.
- pytest – A mature full-featured Python testing tool.
- Robot Framework – A generic test automation framework.
- unittest – (Python standard library) Unit testing framework.
- Test Runners
- GUI / Web Testing
- locust – Scalable user load testing tool written in Python.
- PyAutoGUI – PyAutoGUI is a cross-platform GUI automation Python module for human beings.
- Selenium – Python bindings for Selenium WebDriver.
- sixpack – A language-agnostic A/B Testing framework.
- splinter – Open source tool for testing web applications.
- Mock
- doublex – Powerful test doubles framework for Python.
- freezegun – Travel through time by mocking the datetime module.
- httmock – A mocking library for requests for Python 2.6+ and 3.2+.
- httpretty – HTTP request mock tool for Python.
- mock – (Python standard library) A mocking and patching library.
- Mocket – Socket Mock Framework plus HTTP[S]/asyncio/gevent mocking library with recording/replaying capability.
- responses – A utility library for mocking out the requests Python library.
- VCR.py – Record and replay HTTP interactions on your tests.
- Object Factories
- factory_boy – A test fixtures replacement for Python.
- mixer – Another fixtures replacement. Supports Django, Flask, SQLAlchemy, Peewee, etc.
- model_mommy – Creating random fixtures for testing in Django.
- Code Coverage
- coverage – Code coverage measurement.
- Fake Data
- Error Handler
- FuckIt.py – FuckIt.py uses state-of-the-art technology to make sure your Python code runs whether it has any right to or not.
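A minimal sketch combining two standard-library entries from the Testing list above, unittest and mock. The function greet and the test class are hypothetical examples, not part of any listed library.

```python
import unittest
from unittest import mock


def greet(fetch_name):
    """Hypothetical function under test; fetch_name is a zero-argument callable."""
    return f"Hello, {fetch_name()}!"


class GreetTest(unittest.TestCase):
    def test_greet_uses_fetched_name(self):
        # Replace the real dependency with a Mock that returns a fixed value.
        fake_fetch = mock.Mock(return_value="Ada")
        self.assertEqual(greet(fake_fetch), "Hello, Ada!")
        # Verify the dependency was called exactly once, with no arguments.
        fake_fetch.assert_called_once_with()


if __name__ == "__main__":
    unittest.main()
```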
Text Processing
Libraries for parsing and manipulating plain text.
- General
- chardet – Python 2/3 compatible character encoding detector.
- difflib – (Python standard library) Helpers for computing deltas.
- ftfy – Makes Unicode text less broken and more consistent automagically.
- fuzzywuzzy – Fuzzy String Matching.
- Levenshtein – Fast computation of Levenshtein distance and string similarity.
- pangu.py – Spacing texts for CJK and alphanumerics.
- pyfiglet – An implementation of figlet written in Python.
- pypinyin – Convert Chinese hanzi to pinyin.
- shortuuid – A generator library for concise, unambiguous and URL-safe UUIDs.
- textdistance – Compute distance between sequences. 30+ algorithms, pure python implementation, common interface, optional external libs usage.
- unidecode – ASCII transliterations of Unicode text.
- uniout – Print readable chars instead of the escaped string.
- xpinyin – A library to translate Chinese hanzi (漢字) to pinyin (拼音).
- Slugify
- awesome-slugify – A Python slugify library that can preserve unicode.
- python-slugify – A Python slugify library that translates unicode to ASCII.
- unicode-slugify – A slugifier that generates unicode slugs with Django as a dependency.
- Parser
- phonenumbers – Parsing, formatting, storing and validating international phone numbers.
- PLY – Implementation of lex and yacc parsing tools for Python.
- Pygments – A generic syntax highlighter.
- pyparsing – A general purpose framework for generating parsers.
- python-nameparser – Parsing human names into their individual components.
- python-user-agents – Browser user agent parser.
- sqlparse – A non-validating SQL parser.
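The difflib entry above (helpers for computing deltas) can be sketched as follows. The helper names similarity and closest are hypothetical; SequenceMatcher and get_close_matches are the standard-library calls being illustrated.

```python
import difflib


def similarity(a, b):
    # ratio() returns a float in [0, 1]; 1.0 means the sequences are identical.
    return difflib.SequenceMatcher(None, a, b).ratio()


def closest(word, candidates):
    # Return at most one candidate whose similarity to word is at least 0.6.
    return difflib.get_close_matches(word, candidates, n=1, cutoff=0.6)


if __name__ == "__main__":
    print(similarity("python", "pyton"))
    print(closest("pyton", ["python", "cython", "jython"]))
```

Third-party entries in the list such as fuzzywuzzy and Levenshtein cover similar ground with different scoring functions.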
Third-party APIs
Libraries for accessing third party services APIs. See: List of Python API Wrappers and Libraries.
- apache-libcloud – One Python library for all clouds.
- boto3 – Python interface to Amazon Web Services.
- django-wordpress – WordPress models and views for Django.
- facebook-sdk – Facebook Platform Python SDK.
- facepy – Facepy makes it really easy to interact with Facebook’s Graph API.
- gmail – A Pythonic interface for Gmail.
- google-api-python-client – Google APIs Client Library for Python.
- gspread – Google Spreadsheets Python API.
- twython – A Python wrapper for the Twitter API.
Audio
Libraries for manipulating audio.
- audiolazy – Expressive Digital Signal Processing (DSP) package for Python.
- audioread – Cross-library (GStreamer + Core Audio + MAD + FFmpeg) audio decoding.
- beets – A music library manager and MusicBrainz tagger.
- eyeD3 – A tool for working with audio files, specifically MP3 files containing ID3 metadata.
- id3reader – A Python module for reading MP3 meta data.
- m3u8 – A module for parsing m3u8 files.
- mingus – An advanced music theory and notation package with MIDI file and playback support.
- mutagen – A Python module to handle audio metadata.
- pyAudioAnalysis – Python Audio Analysis Library: Feature Extraction, Classification, Segmentation and Applications.
- pydub – Manipulate audio with a simple and easy high level interface.
- pyechonest – Python client for the Echo Nest API.
- talkbox – A Python library for speech/signal processing.
- TimeSide – Open web audio processing framework.
- tinytag – A library for reading music meta data of MP3, OGG, FLAC and Wave files.
Video
Libraries for manipulating video and GIFs.
- moviepy – A module for script-based movie editing with many formats, including animated GIFs.
- scikit-video – Video processing routines for SciPy.
WSGI Servers
WSGI-compatible web servers.
- bjoern – Asynchronous, very fast and written in C.
- fapws3 – Asynchronous (network side only), written in C.
- gunicorn – Pre-forked, partly written in C.
- meinheld – Asynchronous, partly written in C.
- netius – Asynchronous, very fast.
- rocket – Multi-threaded.
- uWSGI – A project that aims to develop a full stack for building hosting services, written in C.
- waitress – Multi-threaded, powers Pyramid.
- Werkzeug – A WSGI utility library for Python that powers Flask and can easily be embedded into your own projects.
Web Content Extracting
Libraries for extracting web content.
- Haul – An Extensible Image Crawler.
- html2text – Convert HTML to Markdown-formatted text.
- lassie – Web Content Retrieval for Humans.
- micawber – A small library for extracting rich content from URLs.
- newspaper – News extraction, article extraction and content curation in Python.
- python-goose – HTML Content/Article Extractor.
- python-readability – Fast Python port of arc90’s readability tool.
- requests-html – Pythonic HTML Parsing for Humans.
- sanitize – Bringing sanity to world of messed-up data.
- sumy – A module for automatic summarization of text documents and HTML pages.
- textract – Extract text from any document, Word, PowerPoint, PDFs, etc.
- toapi – Every web site provides APIs.
Web Crawling & Web Scraping
Libraries to automate data extraction from websites.
- cola – A distributed crawling framework.
- Demiurge – PyQuery-based scraping micro-framework.
- feedparser – Universal feed parser.
- Grab – Site scraping framework.
- MechanicalSoup – A Python library for automating interaction with websites.
- portia – Visual scraping for Scrapy.
- pyspider – A powerful spider system.
- RoboBrowser – A simple, Pythonic library for browsing the web without a standalone web browser.
- Scrapy – A fast high-level screen scraping and web crawling framework.
Web Frameworks
Full stack web frameworks.
- Django – The most popular web framework in Python.
- Flask – A microframework for Python.
- Pyramid – A small, fast, down-to-earth, open source Python web framework.
- Sanic – Web server that’s written to go fast.
- Tornado – A Web framework and asynchronous networking library.
- Vibora – Fast, efficient and asynchronous Web framework inspired by Flask.
WebSocket
Libraries for working with WebSocket.
- AutobahnPython – WebSocket & WAMP for Python on Twisted and asyncio.
- Crossbar – Open-source Unified Application Router (Websocket & WAMP for Python on Autobahn).
- django-channels – Developer-friendly asynchrony for Django.
- django-socketio – WebSockets for Django.
- WebSocket-for-Python – WebSocket client and server library for Python 2 and 3 as well as PyPy.
Resources
- @codetengu
- @getpy
- @importpython
- @planetpython
- @pycoders
- @pypi
- @pythontrending
- @PythonWeekly
- @TalkPython
- @realpython
Websites
- /r/CoolGithubProjects
- /r/Python
- Awesome Python @LibHunt
- Django Packages
- Full Stack Python
- PyPI Ranking
- Python 3 Wall of Superpowers
- Python Hackers
- Python ZEEF
- Real Python