MIGRATED! Python: Automating basic tasks or how to get your computer to do the boring stuff for you

This article has been moved to my new blog!
Here: https://blog.vda.io/automating-python-01/

Introduction

For some time now, people I know have been asking me for some mentoring on the basics of Python. Most of them were somewhat familiar with basic programming but were definitely not in the IT field and were not coding for a living. They were all interested in using Python to script away some of the boring and/or tedious tasks they faced at work.

To assist them with that, I wrote a small series of tutorials on how to use the awesome package ecosystem for Python in order to accomplish things that would help them with their daily work. I am sure some ‘actual’ developers would scoff at the simplicity of the tasks at hand, but it strikes me as one of those small-change / big-impact situations. Even very small automated tasks can considerably improve the daily work and productivity of these people.

In order to help a wider audience automate their own workflows, I decided to post a series of articles based on the tutorials I designed.

Goals

The subjects I want to cover are very ‘business’ and productivity oriented and are designed to provide examples of how to do specific tasks with Python. These tasks will include the following subjects:

  • processing tabular data, loading and saving CSV files,
  • designing a very basic user interface,
  • working with the web, such as getting files and webpages, and maybe accessing APIs,
  • an introduction to using SQLite and SQL from Python.
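
To give a taste of the first and last items, here is a minimal sketch that reads tabular data in CSV form and queries it with SQL through the standard library's `sqlite3` module. The table name, column names, and sample data are made up for illustration, and the CSV is held in a string so the example runs without any file on disk; the articles themselves may take a different approach.

```python
import csv
import io
import sqlite3

# Made-up data standing in for a CSV file on disk.
CSV_TEXT = """city,visitors
Paris,120
Lyon,45
Paris,80
"""

# Load the CSV rows as dictionaries keyed by the header line.
records = list(csv.DictReader(io.StringIO(CSV_TEXT)))

# An in-memory database keeps the example free of side effects.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE visits (city TEXT, visitors INTEGER)")
connection.executemany(
    "INSERT INTO visits (city, visitors) VALUES (?, ?)",
    [(r["city"], int(r["visitors"])) for r in records],
)

# Total visitors per city, computed by SQL rather than by hand.
totals = connection.execute(
    "SELECT city, SUM(visitors) FROM visits GROUP BY city ORDER BY city"
).fetchall()
connection.close()
print(totals)
```

With a real file, `io.StringIO(CSV_TEXT)` would be replaced by a file opened with `open(path, newline="")`.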

This series of articles is **NOT** designed as an introduction to Python and you will need to know some Python to begin with. I will also **NOT** cover how to install Python and get it running on your favorite operating system. If you are interested in either of these subjects before starting my articles, I gathered some links down below in the ‘Setting up’ section that will guide you through the basics.

Practical ‘business cases’ that you will be solving

  • Automation of the data cleaning process in a spreadsheet: A shared spreadsheet has been used in your company for data entry. Inevitably with such manual work, some fields are not uniform: a Yes/No field has values like ‘Fuck Yeah!’ instead of ‘Yes’, for example. Your job is to clean all of the data for further analysis and you definitely do not feel like going over the thousands of lines in that file.
  • Computation and compilation of statistics and metrics: Once the spreadsheet is suitably clean, you might want to run some sort of analysis on it. You may want to start with very simple statistics like the relative percentage of positive and negative lines, and you may further your analysis with clustering and other advanced methods, all the way to machine learning.
  • Gathering and extraction of data over the Web: The data you want to process might only be available as a webpage or as a file (maybe as a PDF?) somewhere on the internet. It might be data put out by a different department in your company or anything available through your browser like the New York Times or your local government’s website. In any case, you need to be able to grab a webpage, a file or even images and then do something with it.
  • Doing advanced processing with official APIs and SQL: Most companies nowadays give access to their services through APIs. If you need to fetch tweets directly from Twitter or access data from a government API, you will be able to!
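
The first two cases can be sketched in a few lines using only the standard library. This is an illustration under assumptions, not the method the later articles necessarily use: the column name `answer`, the sets of accepted spellings, and the sample data are all hypothetical, and entries that cannot be recognized are left blank so they can be reviewed by hand.

```python
import csv
import io

# Hypothetical spellings to accept; a real spreadsheet would need its own list.
YES_VALUES = {"yes", "y", "yeah"}
NO_VALUES = {"no", "n", "nope"}


def normalize_answer(raw):
    """Return 'Yes', 'No', or '' when the entry cannot be recognized."""
    value = raw.strip().lower()
    if value in YES_VALUES:
        return "Yes"
    if value in NO_VALUES:
        return "No"
    return ""


def clean_rows(rows, column="answer"):
    """Return copies of the rows with the given column normalized."""
    return [{**row, column: normalize_answer(row[column])} for row in rows]


def share_of(rows, value, column="answer"):
    """Percentage of rows whose column equals value (0.0 for no rows)."""
    if not rows:
        return 0.0
    matches = sum(1 for row in rows if row[column] == value)
    return 100.0 * matches / len(rows)


# Made-up sample standing in for an exported spreadsheet.
SAMPLE_CSV = """name,answer
Alice,Yes
Bob, yeah
Carol,NO
Dave,nope
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
cleaned = clean_rows(rows)
yes_share = share_of(cleaned, "Yes")
no_share = share_of(cleaned, "No")
print(f"Yes: {yes_share:.1f}%  No: {no_share:.1f}%")
```

Keeping the normalization in its own small function makes it easy to extend the accepted spellings as new oddities turn up in the data.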

Setting up your workspace

In this section, I will provide you with a few links to help you set up your workspace and get started with Python. However, since it is not in the scope of this series of articles, I will not dwell on it. The official Python for Beginners page is a good start: https://www.python.org/about/gettingstarted/.

Installing Python

One of the best recent tutorials on how to install Python on your favorite operating system has been published by Django Girls as a part of their tutorial on Django (a web framework for Python). You can find it here: https://tutorial.djangogirls.org/en/python_installation/.

Installing a code editor

Even though you can code with the standard Windows Notepad, it is much more comfortable to use a dedicated editor with at least some syntax highlighting.

I have a personal preference for Sublime Text. It works on macOS, Linux and Windows and can be downloaded for free, but if you like it, I would strongly recommend supporting the developer and buying a license. You can download it here: https://www.sublimetext.com/3.

Getting started with the basics of Python

There are a lot of ways to learn Python from scratch out there. You have traditional books, tutorials, videos, MOOCs, podcasts, pretty much anything you can imagine. The Hitchhiker's Guide to Python has a webpage detailing some of the options at this URL: http://docs.python-guide.org/en/latest/intro/learning/.

A fairly good interactive tutorial (you can test your code directly in your web browser) is available at this URL: https://www.learnpython.org/.