namedtuple Comes in Handy

I've been writing a lot of Python code recently. Oftentimes I struggle with what a method should return when I have to relay more than one value back to the caller. For example:

class PaymentGateway:
    def do_transaction(self, target, amount, bill_code, **kwargs):
        """
        Perform some transaction against the API.

        :return: whether the transaction was successful or not
        :rtype: bool
        """
        # stuff happens here
        try:
            result = self.amount_transaction(tx_details)
            logger.info("Success: CODE=%s Details=%s" % (result.code, result.detail))
            return True
        except GatewayException as ex:
            logger.error("Transaction failed: ERROR=%s reason=%s" % (ex.err_code, ex.message))
            return False

The code that calls do_transaction might look like this:

if payment_gw.do_transaction(subid, amount, bill_code, service_id=service_id, ref_code=ref_code) is True:
    # Hooray! Succe$$!
    report_success("Transaction for %s was successful. Check logs for status code." % subid)
else:
    # Boo
    report_failure("Transaction failed. I don't know why...")

Many times this is fine, but what if the caller needs the details from the amount_transaction result or the GatewayException? A quick solution is to return a dict:

class PaymentGateway:
    def do_transaction(self, target, amount, bill_code, **kwargs):
        """
        Perform some transaction against the API.

        :return: a dict that contains keys 'success', 'code', and 'detail'
        :rtype: dict
        """
        # stuff happens here
        try:
            result = self.amount_transaction(tx_details)
            logger.info("Success: CODE=%s Details=%s" % (result.code, result.detail))
            success_dict = {
                'success': True,
                'code': result.code,
                'detail': result.detail,
            }
            return success_dict
        except GatewayException as ex:
            logger.error("Transaction failed: ERROR=%s reason=%s" % (ex.err_code, ex.message))
            error_dict = {
                'success': False,
                'code': ex.err_code,
                'detail': ex.message,
            }
            return error_dict

It works, but it's pretty ad hoc. The structure of whatever do_transaction returns won't be obvious unless you dig into the code. The caller ends up looking like this:

payment_status = payment_gw.do_transaction(subid, amount, bill_code, service_id=service_id, ref_code=ref_code)
if payment_status['success'] is True:
    # Hooray! Succe$$!
    report_success("Transaction for %s was successful, status code %s" % (subid, payment_status['code']))
else:
    # Boo
    report_failure("Transaction failed, because: %s" % payment_status['detail'])

Now the caller is polluted with literal strings like 'success', 'code' and 'detail'. These can be hell to debug, especially if you happen to misspell one of them in your code, even if you're using an awesome IDE like PyCharm.
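
To make the typo problem concrete, here is a minimal sketch. The values in payment_status are made up for illustration; only the keys match the dict that do_transaction returns above.

```python
# Hypothetical return value, shaped like the dict built in do_transaction.
payment_status = {'success': False, 'code': 'E42', 'detail': 'card declined'}

# A misspelled key is only caught when this exact line runs,
# which may be a rarely exercised failure branch.
try:
    payment_status['detial']
except KeyError as err:
    missing_key = err.args[0]
    print("KeyError for key: %s" % missing_key)

# dict.get() hides the typo entirely: it quietly returns None.
reason = payment_status.get('detial')
print(reason)
```

Neither case is flagged before the code runs, which is what makes these bugs tedious to track down.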

An alternative to defining these ad-hoc dict structures is to use namedtuple from the collections module.

from collections import namedtuple

PaymentStatus = namedtuple('PaymentStatus', ['success', 'code', 'detail'])

class PaymentGateway:
    def do_transaction(self, target, amount, bill_code, **kwargs):
        """
        Perform some transaction against the API.

        :return: a PaymentStatus with the fields success, code, and detail
        :rtype: PaymentStatus
        """
        # stuff happens here
        try:
            result = self.amount_transaction(tx_details)
            logger.info("Success: CODE=%s Details=%s" % (result.code, result.detail))
            return PaymentStatus(True, result.code, result.detail)
        except GatewayException as ex:
            logger.error("Transaction failed: ERROR=%s reason=%s" % (ex.err_code, ex.message))
            return PaymentStatus(False, ex.err_code, ex.message)
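
As a quick standalone illustration of what a PaymentStatus instance gives us, here is a small sketch. The field values are invented; the calls themselves (_fields, keyword construction, unpacking) are standard namedtuple behaviour.

```python
from collections import namedtuple

PaymentStatus = namedtuple('PaymentStatus', ['success', 'code', 'detail'])

# Made-up values, for illustration only.
status = PaymentStatus(success=False, code='E42', detail='card declined')

print(PaymentStatus._fields)  # the field names live on the class itself
print(status.detail)          # attribute access instead of a string key

# It is still a plain tuple, so unpacking works as usual.
success, code, detail = status

# Instances are immutable: assigning to a field raises AttributeError.
try:
    status.success = True
    mutated = True
except AttributeError:
    mutated = False
```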

namedtuple forces us to be explicit about what do_transaction returns. And explicit is better than implicit. For the caller, this looks like:

payment_status = payment_gw.do_transaction(subid, amount, bill_code, service_id=service_id, ref_code=ref_code)
if payment_status.success is True:
    # Hooray! Succe$$!
    report_success("Transaction for %s was successful, status code %s" % (subid, payment_status.code))
else:
    # Boo
    report_failure("Transaction failed, because: %s" % payment_status.detail)

This is almost as simple as our first example, and it is free of string-literal keys. If you're using PyCharm, you can also take advantage of code completion, which will know about the attributes of your new namedtuple class:

/images/pycharm_namedtuple.png

So if your code is littered with string literals as keys for return values from methods that return dict, consider having them return a namedtuple instead.
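
If you want to migrate gradually, one possible approach (a sketch, not the only way to do it) is to convert at the boundary. The legacy dict below is a hypothetical value in the old ad-hoc shape.

```python
from collections import namedtuple

PaymentStatus = namedtuple('PaymentStatus', ['success', 'code', 'detail'])

# Hypothetical dict in the old ad-hoc shape.
legacy = {'success': True, 'code': '00', 'detail': 'approved'}

# dict -> namedtuple: keys must match the field names exactly.
status = PaymentStatus(**legacy)

# namedtuple -> dict, for callers that still index by key.
as_dict = dict(status._asdict())

# A misspelled key now fails loudly at construction time.
try:
    PaymentStatus(**{'success': True, 'code': '00', 'detial': 'approved'})
    typo_caught = False
except TypeError:
    typo_caught = True
```

The conversion point becomes the one place where a bad key shows up, instead of every caller that reads the result.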

The Art of Data Science

/images/art-of-data-science-book.thumbnail.png

I will admit, I was pretty stoked yesterday when Roger Peng retweeted my announcement that his new book was available.

In the book, Peng and co-author Elizabeth Matsui walk us through the different activities of data analysis: formulating questions, doing basic exploratory data analysis to get a rough feel for the data, modelling the data with familiar distributions, and finally basic inference and prediction.

Using R and the datasets that come bundled with it, Peng and Matsui demonstrate how each activity is actually an iterative process itself. At each stage, it's important to evaluate what you already know (or think you know) and revise your expectations based on the data.