How to send data to Google Analytics in Python with PyGAMP

The Google Analytics Measurement Protocol API lets you send data to your GA account that wasn’t generated by a user visiting a web page. Since it’s so flexible, you...

How to scrape Open Graph protocol data using Python

Many websites include Open Graph protocol data in their document head. This structured data allows social networks, such as Facebook and Twitter, to access specific elements of the page’s content...
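
As a rough illustration of the idea, the sketch below pulls `og:` properties out of a page’s meta tags using only Python’s standard library. The `OpenGraphParser` class, the `extract_open_graph` helper, and the sample HTML are all hypothetical names and data invented for this example; they are not taken from the article itself, which may well use a different library.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collects <meta property="og:..." content="..."> pairs (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        prop = attr.get("property") or ""
        content = attr.get("content")
        # Keep only Open Graph properties that carry a content value.
        if prop.startswith("og:") and content is not None:
            self.properties[prop] = content


# Made-up sample document; a real page would be fetched over HTTP first.
SAMPLE_HTML = """
<html><head>
<title>Example page</title>
<meta property="og:title" content="Example page">
<meta property="og:type" content="article" />
<meta name="description" content="Not Open Graph">
</head><body></body></html>
"""


def extract_open_graph(html):
    """Return a dict of og:* properties found in the HTML string."""
    parser = OpenGraphParser()
    parser.feed(html)
    parser.close()
    return parser.properties


og = extract_open_graph(SAMPLE_HTML)
print(og)
```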

How to scrape and parse a robots.txt file using Python

When scraping websites, and when checking how well a site is configured for crawling, it pays to carefully check and parse the site’s robots.txt file. This file, which should be...
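
One possible way to check a robots.txt file is with `urllib.robotparser` from the standard library. This is a minimal sketch, assuming a made-up robots.txt body (`SAMPLE_ROBOTS_TXT`) and a hypothetical `parse_robots` helper; in practice the file would be downloaded from the site being examined. Note that `site_maps()` requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt content for illustration only.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /
Crawl-delay: 5
Sitemap: https://example.com/sitemap.xml
"""


def parse_robots(text):
    """Build a RobotFileParser from robots.txt text held in memory."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = parse_robots(SAMPLE_ROBOTS_TXT)

# Ask whether a generic crawler may fetch specific URLs.
blocked = parser.can_fetch("*", "https://example.com/private/page")
allowed = parser.can_fetch("*", "https://example.com/blog/")

# Read the crawl delay and any sitemap URLs the file declares.
delay = parser.crawl_delay("*")
sitemaps = parser.site_maps()

print(blocked, allowed, delay, sitemaps)
```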

How to scrape a site's page titles and meta descriptions

Scraping the titles and meta descriptions from every page on a site can tell you a great deal about its content, the underlying content strategy, or product ranges, and many...

How to scan a site for 404 errors and 301 redirect chains

Both 404 “page not found” errors and 301 redirect chains can be costly and can damage a website’s performance. They’re both easy to introduce, especially on ecommerce sites...

How to resize and compress images with TinyPNG

Large, uncompressed images slow down your site, increase bandwidth costs, harm the user experience, and impact search engine rankings. In this project, I’ll show you how you can bulk resize...

How to preprocess text for NLP in four easy steps

Many data science projects involve a lot of repetition. In tasks that use Natural Language Processing (NLP), for example, you’ll always need to preprocess your text to...

How to parse XML sitemaps using Python

XML sitemaps are designed to make life easier for search engines by providing an index of a site’s URLs. However, they’re also a useful tool in competitor analysis and allow...
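
To make the idea concrete, here is a small sketch that reads the URLs out of a sitemap with `xml.etree.ElementTree`. The sample XML and the `sitemap_urls` helper are assumptions made up for this example, and a real sitemap would normally be fetched from the site first; the namespace URI is the standard one from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Standard sitemap namespace, mapped to a short prefix for lookups.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Made-up sitemap content for illustration only.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc><lastmod>2024-01-01</lastmod></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>
"""


def sitemap_urls(xml_text):
    """Return one dict per <url> entry; lastmod is None when absent."""
    root = ET.fromstring(xml_text)
    rows = []
    for url in root.findall("sm:url", SITEMAP_NS):
        rows.append({
            "loc": url.findtext("sm:loc", namespaces=SITEMAP_NS),
            "lastmod": url.findtext("sm:lastmod", namespaces=SITEMAP_NS),
        })
    return rows


rows = sitemap_urls(SAMPLE_SITEMAP)
for row in rows:
    print(row["loc"], row["lastmod"])
```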

How to parse URL structures using Python

URLs often contain useful information that can be used to analyse a website, a user’s search, or the breakdown of content present in each section. While they often look pretty...
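
As a brief illustration, the sketch below splits a URL into its parts with `urllib.parse`. The `describe_url` helper and the example URL are hypothetical and only meant to show the kind of information a URL can yield; the article may take a different approach.

```python
from urllib.parse import urlparse, parse_qs


def describe_url(url):
    """Break a URL into scheme, domain, path segments and query parameters."""
    parts = urlparse(url)
    # Drop empty strings produced by leading and trailing slashes.
    segments = [segment for segment in parts.path.split("/") if segment]
    return {
        "scheme": parts.scheme,
        "domain": parts.netloc,
        "path_segments": segments,
        "query": parse_qs(parts.query),
    }


# Made-up URL for illustration only.
info = describe_url("https://example.com/shop/shoes/?colour=red&size=9")
print(info)
```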