Python for Scientists – Simple Script Framework

How do your scripts flow? Can others easily read and follow your code? Have you ever hesitated to hand somebody your code because it’s “messy” or needs documentation? I think you’re getting the idea of where I’m going with this.

Every Python project needs at least three things: a consistent framework, code comments (I use code and scripts interchangeably), and a README. I have driven home the need for a README in the past, but it’s worth mentioning again. You will be forever grateful to your past self when you open an old Python script and can easily refresh yourself in 10-15 minutes. If you don’t comment, future you is going to waste at least a few hours figuring out what in the world past you was thinking! We don’t want that, so let’s get to it.

Topics covered in this post

  • Script Framework
  • Comments
  • Variable Naming Conventions
  • Introduction to Functions
  • Introduction to Whitespace

Updated: 2020-01-28


Frameworks

The top seems like a great starting place. I tend to include basic information about the script’s purpose, who wrote it, when, and when it was updated. Sometimes I leave off the updated section because Git will track that information for me. Here’s a simple way to start every script:

"""
This is an example header for an awesome project

Author: AtmoGuy
Created: 2020-01-28
"""

Write a sentence or two about the core purpose of the script at the top. You can write more, but I place details inside functions and directly above tricky sections. Triple quotes (either single or double quotes) create a multi-line string, which works as a block comment. You need two sets: a triple quote to start the comment and a second triple quote to end it.

The next step is to separate the main sections of your code. The hash symbol (#) starts a single-line comment in Python. Anything to the right of a # (on the same line) is not executed. I like to use a Main section, but you don’t have to use one. Code in the Main section, placed under an if __name__ == "__main__": check, will only run if you run the script directly. For example, you run script A, which imports a function from script B. The functions imported from script B are available to script A, but code within the Main section of script B will not run.

"""
This is an example header for an awesome project

Author: AtmoGuy
Created: 2020-01-28
"""

# /// Imports ///

# /// Functions ///

# /// Main ///

The # /// Imports /// section is where all package imports go. So far so good, right? It can be easy to import packages randomly right before they are needed, but let’s keep things tidy and keep them in one place. It’s easier to keep track of what has been imported using this method.

Functions? What’s a function? We will use a simple function for now and go deeper into functions in a later post. Place all of your functions in the # /// Functions /// section to make sure they are defined before they are needed. If you don’t, failure will ensue!
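To see why the order matters, here is a minimal sketch. The function cube_number is hypothetical and used only for this illustration. Calling it above its definition fails with a NameError, while the same call below the definition works.

```python
# Code placed too early: cube_number does not exist yet at this point.
# (cube_number is a hypothetical example function, not part of the post's script.)
try:
    early_result = cube_number(3)
except NameError:
    early_result = None
    print('cube_number was called before it was defined')


# /// Functions ///
def cube_number(number):
    """Return the cube of a number."""
    return number ** 3


# /// Main ///
late_result = cube_number(3)  # works: the function is defined above this line
print('3 cubed is: {}'.format(late_result))
```

Keeping every function in one section above the Main section avoids this problem entirely.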

All functions start with def, then a space, a descriptive function name, and then a list of inputs (input1, input2, input3). Here’s an example function we will use in our example script.

def square_number(number):
    """
    Purpose
    -------
    Squares a number

    Parameters
    ----------
    number: float
        A whole number from 1 to almost infinity

    Returns
    -------
    new_number: float
        The square of the input number 
    """

    new_number = number ** 2
    return new_number

The name of the function is square_number and it takes a single value (number). The name of the variable in parentheses does not have to be the same name as the variable passed to it! For example, it is acceptable to pass a variable named my_number to the square_number function. Use a natural flow of language to name functions and try not to use abbreviations. Full words are easier to come back to after a long break from a script. Okay, if your word is antidisestablishmentarianism maybe you can abbreviate.
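As a quick sketch of that point, the variable passed in below is called my_number while the function's input is called number. The second function, calculate_rectangle_area, is a hypothetical example added here only to show a function with more than one input.

```python
def square_number(number):
    """Return the square of the input number."""
    return number ** 2


def calculate_rectangle_area(width, height):
    """Return the area of a rectangle (hypothetical two-input example)."""
    return width * height


my_number = 4
my_number_squared = square_number(my_number)  # my_number is received as "number"
print('My number squared is: {}'.format(my_number_squared))

room_width = 3.0
room_height = 2.5
room_area = calculate_rectangle_area(room_width, room_height)
print('Room area is: {}'.format(room_area))
```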

Notice that lines 2-19 of the function are indented. Python requires proper whitespace (indentation). All code that is part of the function must be indented consistently. You can use tab to indent, but I prefer four spaces. Tab can sometimes cause problems that don’t exist when spaces are used. Whitespace is also required when loops are used. I think all code should have proper indentation because code is easier to read when everything is indented properly. Python goes a step further and requires correct use of whitespace; without it, Python will throw an error and quit.
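A minimal sketch of both cases follows. The function sum_of_squares is hypothetical: its body is indented four spaces and its loop body four more. The second half hands Python a deliberately broken snippet through the built-in compile function to show the IndentationError that results.

```python
def sum_of_squares(numbers):
    # Everything indented four spaces belongs to the function.
    total = 0
    for number in numbers:
        # Everything indented eight spaces belongs to the loop.
        total = total + number ** 2
    return total


print('Sum of squares: {}'.format(sum_of_squares([1, 2, 3])))

# The function body below is NOT indented, so Python refuses to compile it.
badly_indented_source = "def broken_function():\nreturn 1\n"
try:
    compile(badly_indented_source, '<example>', 'exec')
    indentation_error_raised = False
except IndentationError:
    indentation_error_raised = True

print('IndentationError raised: {}'.format(indentation_error_raised))
```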

Another thing to note is the use of a block comment to document the function. square_number is a simple function, so its block comment does not need an Attributes section. Here’s a template for how I document functions:

def some_function(inputA):
    """
    Purpose
    -------
    This function does something

    Parameters
    ----------
    inputA: float
        short description

    Attributes
    ----------
    secondary_variable: float
        short description

    Returns
    -------
    output: float
        short description
    """
    secondary_variable = inputA + 5
    output = secondary_variable * 8
    return output

The example looks silly because the comment is longer than the actual function, but that’s okay. Parameters are inputs into the function, in this case inputA. Attributes are variables only used inside the function. They are not returned and are removed from memory after the function is complete. The Returns section describes what is passed back out of the function. The Parameters and Attributes headings are not always required. It’s good practice to provide the shape of the variables either directly after the variable type or in the short description.
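Here is a sketch of the template applied to a small function, with the shape written after the variable type. The function convert_kelvin_to_celsius and all of its names are hypothetical, chosen only for illustration. Because the block comment sits directly under def, Python stores it as the function's docstring, so it can be printed without opening the file.

```python
import inspect


def convert_kelvin_to_celsius(temperature_kelvin):
    """
    Purpose
    -------
    Converts air temperature from Kelvin to Celsius

    Parameters
    ----------
    temperature_kelvin: list of float, shape (n_times)
        Air temperature in Kelvin

    Returns
    -------
    temperature_celsius: list of float, shape (n_times)
        Air temperature in degrees Celsius
    """
    temperature_celsius = [value - 273.15 for value in temperature_kelvin]
    return temperature_celsius


# Print the documentation written above (inspect.getdoc tidies the indentation)
print(inspect.getdoc(convert_kelvin_to_celsius))
print(convert_kelvin_to_celsius([273.15, 283.15]))
```

In an interactive session, help(convert_kelvin_to_celsius) displays the same text.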

Variable Names

Be explicit! Code is easier to follow if variable names are fully spelled out. Temperature is a commonly abbreviated variable in atmospheric sciences. It’s a four-syllable word. Abbreviate it, right? No! Let’s look at a few reasonable abbreviations: temp, tmp, t, and at. But temporary variables are also handy to have at times. What would we name that variable, temp? But that’s already used for air temperature.

It’s appealing to abbreviate variable names. You will type them numerous times. It’s faster, you say. But typing out an entire word takes only a couple of seconds longer than typing an abbreviation. Let’s assume you type that variable name 500 times. In my opinion this is an extreme amount, and if you’re above that maybe you should consider refactoring (rewriting the code to be more efficient). At 2 extra seconds per word, that’s an extra 1000 seconds (500 × 2), or about 15-20 minutes of “wasted” time.

Fast forward a few months. Your professor/boss wants you to double check an analysis to ensure it’s correct because your code is going into production or being published. You dive into the code only to find a ton of abbreviated variables. It will likely take you at least half a day to refresh yourself if you didn’t provide comments/documentation and even longer if variable names are abbreviated. The 15-20 minutes saved on the front end can cost you hours to days on the back end. Please, don’t abbreviate variable names.
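To make the comparison concrete, here is a hypothetical dew point depression calculation written twice. The function names, variable names, and values are all made up for this sketch. Both versions return the same number, but only the second one explains itself.

```python
# Abbreviated version: what are t, td, and tmp?
def dpd(t, td):
    tmp = t - td
    return tmp


# Explicit version: every name says what it holds
def calculate_dew_point_depression(air_temperature, dew_point_temperature):
    dew_point_depression = air_temperature - dew_point_temperature
    return dew_point_depression


air_temperature = 25.0         # degrees Celsius (made-up value)
dew_point_temperature = 18.5   # degrees Celsius (made-up value)

print(dpd(air_temperature, dew_point_temperature))
print(calculate_dew_point_depression(air_temperature, dew_point_temperature))
```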

Main

As discussed earlier, code that is under the Main section will only be run if the file is run directly. It’s easier to explain by example.

# other sections removed for simplification

# /// Main ///
if __name__ == "__main__":
    print('Hello World')

This file is named “helloWorld.py”. If we run it with python helloWorld.py, the code under the if __name__ check will execute and print Hello World. However, if we had a second file named “example.py” that imported the square_number function from “helloWorld.py”, then Hello World would not be printed. The __name__ variable is special: Python sets it automatically, to "__main__" when a file is run directly and to the module’s name (here helloWorld) when the file is imported. This is not something you have to do on your own. If you’re still confused, here’s another explanation.
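The sketch below tries this out from a single file. It writes a tiny helloWorld.py into a temporary folder and executes it twice with runpy.run_path from the standard library: once with __name__ set to "__main__" (as when the file is run directly) and once with __name__ set to "helloWorld" (as when the file is imported). Using runpy as a stand-in for a real import, and the helper run_and_capture, are assumptions of this sketch rather than part of the original script.

```python
import contextlib
import io
import runpy
import tempfile
from pathlib import Path

# Contents of a small stand-in for helloWorld.py
HELLO_WORLD_SOURCE = '''
def square_number(number):
    return number ** 2

if __name__ == "__main__":
    print("Hello World")
'''


def run_and_capture(script_path, run_name):
    """Execute a script with the given __name__ and return what it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        runpy.run_path(str(script_path), run_name=run_name)
    return buffer.getvalue()


with tempfile.TemporaryDirectory() as folder:
    script_path = Path(folder) / 'helloWorld.py'
    script_path.write_text(HELLO_WORLD_SOURCE)

    # __name__ is "__main__": the Main section runs
    output_when_run_directly = run_and_capture(script_path, '__main__')
    # __name__ is "helloWorld": the Main section is skipped
    output_when_imported = run_and_capture(script_path, 'helloWorld')

print('Run directly printed: {!r}'.format(output_when_run_directly))
print('Imported printed: {!r}'.format(output_when_imported))
```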

Full Example

We have talked about each section, now let’s set up a full example.

"""
This is an example header for an awesome project

Author: AtmoGuy
Created: 2020-01-28
"""

# /// Imports ///
import os


# /// Functions ///
def square_number(number):
    """
    Purpose
    -------
    Squares a number

    Parameters
    ----------
    number: float
        A whole number from 1 to almost infinity

    Returns
    -------
    new_number: float
        The square of the input number 
    """

    new_number = number ** 2
    return new_number


# /// Main ///
if __name__ == "__main__":
    print('Hello World')
    print('We are here {}'.format(os.getcwd()))
    my_number = 2
    print('My number is: {}'.format(my_number))
    my_number_squared = square_number(my_number)
    print('My number squared is: {}'.format(my_number_squared))

The top section provides a brief overview of the script. Imports shows a single import of os, which gives Python access to various operating system functionality. To run this script directly, activate (or create) a Python environment within an Anaconda or command prompt and navigate to the directory where the script is saved. Type python helloWorld.py and press the Enter key. Python will print the output on the prompt. I used variables with underscores between words, but camel case (e.g. myNumberSquared) is also fine. Some people are passionate about which method is used. Use whichever feels right to you.

Print statements have also been introduced. Each one prints to the prompt as soon as Python reaches it, so you can read the output right away. The brackets {} are a placeholder for a variable. You can set precision (number of decimal places) within the brackets. This site has a good rundown of the functionality of .format. I recommend the new style (.format) over the old style (%): the old style is still supported, but the Python documentation notes that it has quirks that lead to common errors.
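A short sketch of the placeholder and precision syntax, using a made-up temperature value:

```python
air_temperature = 21.456789  # degrees Celsius (made-up value)

# Empty brackets insert the value as-is
plain_message = 'Air temperature: {}'.format(air_temperature)

# :.2f rounds to two decimal places
precise_message = 'Air temperature: {:.2f}'.format(air_temperature)

# Several placeholders are filled in order, left to right
combined_message = '{} squared is {}'.format(3, 3 ** 2)

print(plain_message)     # Air temperature: 21.456789
print(precise_message)   # Air temperature: 21.46
print(combined_message)  # 3 squared is 9
```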


Remember, this is only an example. Go ahead and tweak things to your liking. Whatever you use, be consistent! It helps to create a template like the one above (without any functions or print statements) and save it. Every time you want to start a new script simply copy/paste the template into your respective code directory and you’re ready to go!
