@z4r
Last active December 12, 2015 09:48

#Doctest isn't palindrome#

A brief introduction to test-driven development using doctest @ Mobile World Congress 2013 in Barcelona.

##Abstract##

In Web development, requirements are fickle and ever-changing. The most trivial request could be the straw that breaks the camel's back. Test-driven development (TDD) is the best weapon in our hands; doctest is the simplest tool to explore its efficiency. And let's face it, palindromes fascinate everyone.

###Issue != Requirements###

Every day a developer has to understand his/her requestors (employers, line managers, colleagues). Unfortunately the request is rarely clear and comprehensive, sometimes because the problem is not clear even to those who ask us to solve it. Language barriers can make the problem worse. However, an example is more enlightening than a thousand words. And what if we could write these examples in a language closer to us?

###Doctest###

Doctest searches for pieces of text that look like interactive Python sessions, and then executes those sessions to verify that they work exactly as shown. Simply put: working examples. Because the examples live in the documentation, it also helps keep the documentation working and up to date. When is it better not to use them?

  • when we need something complex with common setup/teardown
  • when we try to get better coverage of all cases
  • when we want to keep tests independent from each other
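
The "working examples" idea can be sketched in a few lines. The snippet below is only an illustration for Python 3: `square` and `run_examples` are hypothetical names that do not appear elsewhere in this talk, and the doctest machinery is driven by hand (through `doctest.DocTestParser` and `doctest.DocTestRunner`) instead of `python -m doctest`, so that the outcome can be inspected as a value.

```python
import doctest


def square(n):
    """Return n squared.

    >>> square(3)
    9
    >>> square(-4)
    16
    """
    return n * n


def run_examples(func):
    """Run the examples in func's docstring and return (failed, attempted)."""
    parser = doctest.DocTestParser()
    # Build a DocTest from the docstring; the globals make func callable by name.
    test = parser.get_doctest(func.__doc__, {func.__name__: func},
                              func.__name__, None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    report = []  # failure reports are collected here instead of being printed
    results = runner.run(test, out=report.append)
    return results.failed, results.attempted


if __name__ == "__main__":
    print(run_examples(square))
```

Each `>>>` line is executed and its output compared with the text that follows it, which is why the docstring doubles as a test.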

###Test-driven development###

For now you just need to know the three laws of TDD:

  • You may not write production code until you have written a failing test.
  • You may not write more of a test than is sufficient to fail, and not compiling is failing.
  • You may not write more production code than is sufficient to pass the current failing test.

##But we're here for palindromes##

OK, OK, I understand; who doesn't love them?

###The Problem###

A long time ago, in a galaxy far, far away...

Product Owner: I want something to recognize palindromes!

Just in my mind: A palindrome is a word, phrase, number, or other sequence of units that may be read the same way in either direction, with general allowances for adjustments to punctuation and word dividers.

Me: Sure...! Could you please give me an example of what you mean by palindromes?

Product Owner: "Level", 12321 and "Dammit, I'm mad!"

Just in my mind: No doubt about the last one!

Me: Ok, I'm ready!

###Love @ the 1st Failure###

I know three examples:

  • "Level"
  • 12321
  • "Dammit, I'm mad!"

and the 1st law of TDD says:

  • You may not write production code until you have written a failing test.
"""
>>> is_palindrome("Level")
True
"""

Let's save it to a file (e.g. palindrome.py) and execute it:

$ python -m doctest -v palindrome.py 
Trying:
    is_palindrome("Level")
Expecting:
    True
**********************************************************************
File "palindrome.py", line 3, in palindrome
Failed example:
    is_palindrome("Level")
Exception raised:
    Traceback (most recent call last):
        ...
    NameError: name 'is_palindrome' is not defined
**********************************************************************
1 items had failures:
   1 of   1 in palindrome
1 tests in 1 items.
0 passed and 1 failed.
***Test Failed*** 1 failures.

Our function will take one input and return True:

def is_palindrome(input):
    """
    >>> is_palindrome("Level")
    True
    """
    return True
$ python -m doctest -v palindrome.py 
Trying:
    is_palindrome("Level")
Expecting:
    True
ok
1 items had no tests:
    palindrome
1 items passed all tests:
   1 tests in palindrome.is_palindrome
1 tests in 2 items.
1 passed and 0 failed.
Test passed.

Am I joking? Of course not! The third law says:

  • You may not write more production code than is sufficient to pass the current failing test.

###Palindrome isn't Palindrome###

RED

def is_palindrome(input):
    """
    >>> is_palindrome("Level")
    True
    >>> is_palindrome("Palindrome")
    False
    """
    return True
$ python -m doctest palindrome.py 
**********************************************************************
File "palindrome.py", line 5, in palindrome.is_palindrome
Failed example:
    is_palindrome("Palindrome")
Expected:
    False
Got:
    True
**********************************************************************
1 items had failures:
   1 of   2 in palindrome.is_palindrome
***Test Failed*** 1 failures.

AND GREEN

def is_palindrome(input):
    """ 
    >>> is_palindrome("Level")
    True
    >>> is_palindrome("Palindrome")
    False
    """
    input = input.lower()
    return input == input[::-1]

At this stage we have already identified three responsibilities:

  • normalization input.lower()
  • reversal input[::-1]
  • comparison ==

but it is too early to think about refactoring.
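
To make the three responsibilities concrete without refactoring anything yet, the sketch below exposes the intermediate value of each step. `palindrome_steps` is a hypothetical helper written only for this illustration; it is not part of palindrome.py.

```python
def palindrome_steps(word):
    """Return the intermediate value produced by each responsibility."""
    normalized = word.lower()        # 1. normalization
    backwards = normalized[::-1]     # 2. reversal
    same = normalized == backwards   # 3. comparison
    return normalized, backwards, same


if __name__ == "__main__":
    for word in ("Level", "Palindrome"):
        print(word, palindrome_steps(word))
```

For "Level" the normalized and reversed values coincide; for "Palindrome" they do not, which is all the current tests require.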

###Numbers: a simple test cycle###

RED

def is_palindrome(input):
    """ 
    >>> is_palindrome("Level")
    True
    >>> is_palindrome("Palindrome")
    False
    >>> is_palindrome(12321)
    True
    """
    input = input.lower()
    return input == input[::-1]
$ python -m doctest palindrome.py 
**********************************************************************
File "palindrome.py", line 7, in palindrome.is_palindrome
Failed example:
    is_palindrome(12321)
Exception raised:
    Traceback (most recent call last):
        ...
    AttributeError: 'int' object has no attribute 'lower'
**********************************************************************
1 items had failures:
   1 of   3 in palindrome.is_palindrome
***Test Failed*** 1 failures.

AND GREEN

def is_palindrome(input):
    """ 
    >>> is_palindrome("Level")
    True
    >>> is_palindrome("Palindrome")
    False
    >>> is_palindrome(12321)
    True
    """
    input = str(input).lower()
    return input == input[::-1]

###Spaces & co.: Refactoring TIME###

RED

def is_palindrome(input):
    """ 
    >>> is_palindrome("Level")
    True
    >>> is_palindrome("Palindrome")
    False
    >>> is_palindrome(12321)
    True
    >>> is_palindrome("Dammit, I'm mad!")
    True
    """
    input = str(input).lower()
    return input == input[::-1]
$ python -m doctest palindrome.py 
**********************************************************************
File "palindrome.py", line 9, in palindrome.is_palindrome
Failed example:
    is_palindrome("Dammit, I'm mad!")
Expected:
    True
Got:
    False
**********************************************************************
1 items had failures:
   1 of   4 in palindrome.is_palindrome
***Test Failed*** 1 failures.

AND UGLY GREEN

def is_palindrome(input):
    """ 
    >>> is_palindrome("Level")
    True
    >>> is_palindrome("Palindrome")
    False
    >>> is_palindrome(12321)
    True
    >>> is_palindrome("Dammit, I'm mad!")
    True
    """
    from string import digits, letters
    input = filter(lambda char: char in digits + letters, str(input).lower())
    return input == input[::-1]

AND THIS TIME: REFACTORING

from string import digits, letters


def is_palindrome(input):
    """ 
    >>> is_palindrome("Level")
    True
    >>> is_palindrome("Palindrome")
    False
    >>> is_palindrome(12321)
    True
    >>> is_palindrome("Dammit, I'm mad!")
    True
    """
    input = normalize(input)
    return input == input[::-1]
    
def normalize(input):
    """ 
    >>> normalize("Level")
    'level'
    >>> normalize("Palindrome")
    'palindrome'
    >>> normalize(12321)
    '12321'
    >>> normalize("Dammit, I'm mad!")
    'dammitimmad'
    """
    return filter(lambda char: char in digits + letters, str(input).lower())

palindrome 1.0 is ready for the first release!
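
The listings in this talk target Python 2: `string.letters` was removed in Python 3, where `filter` also returns an iterator rather than a string. For readers on Python 3, one possible equivalent is sketched below. Using `string.ascii_letters` and `str.join` is an assumption about the closest match, and the `ALLOWED` constant is a name introduced only for this sketch.

```python
from string import ascii_letters, digits

# Assumption: ASCII letters stand in for Python 2's locale-dependent string.letters.
ALLOWED = set(digits + ascii_letters)


def normalize(input):
    """
    >>> normalize("Dammit, I'm mad!")
    'dammitimmad'
    """
    # Lowercase first, then keep only digits and letters.
    return "".join(char for char in str(input).lower() if char in ALLOWED)


def is_palindrome(input):
    """
    >>> is_palindrome(12321)
    True
    """
    input = normalize(input)
    return input == input[::-1]
```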

##The devil teaches us his tricks but not how to hide them##

Product Owner: Our Indian Country Manager is passionate about Tamil literature...

Just in my mind: I knew ... encoding

Product Owner: ...and wants to use the palindrome function!

Me: Could you please give me an example?

Product Owner: I know you like examples and this time I'm ready too: ழிகழி means Likali and it's a palindrome, whereas ழிகழை means Likalai and it isn't a palindrome

Just in my mind: I've taught him very, very well

Me: Ok, I'm ready...again!

##ழிகழி##

Expected and Unexpected errors...however RED

# coding=utf-8
from string import digits, letters


def is_palindrome(input):
    """
    >>> is_palindrome("Level")
    True
    >>> is_palindrome("Palindrome")
    False
    >>> is_palindrome(12321)
    True
    >>> is_palindrome("Dammit, I'm mad!")
    True
    >>> is_palindrome("ழிகழி")
    True
    >>> is_palindrome("ழிகழை")
    False
    """
    input = normalize(input)
    return input == input[::-1]

def normalize(input):
    """
    >>> normalize("Level")
    'level'
    >>> normalize("Palindrome")
    'palindrome'
    >>> normalize(12321)
    '12321'
    >>> normalize("Dammit, I'm mad!")
    'dammitimmad'
    >>> normalize("ழிகழி")
    'ழிகழி'
    >>> normalize("ழிகழை")
    'ழிகழை'
    """
    return filter(lambda char: char in digits + letters, str(input).lower())
$ python -m doctest palindrome.py
**********************************************************************
File "palindrome.py", line 17, in palindrome.is_palindrome
Failed example:
    is_palindrome("ழிகழை")
Expected:
    False
Got:
    True
**********************************************************************
File "palindrome.py", line 33, in palindrome.normalize
Failed example:
    normalize("ழிகழி")
Expected:
    'ழிகழி'
Got:
    ''
**********************************************************************
File "palindrome.py", line 35, in palindrome.normalize
Failed example:
    normalize("ழிகழை")
Expected:
    'ழிகழை'
Got:
    ''
**********************************************************************
2 items had failures:
   1 of   6 in palindrome.is_palindrome
   2 of   6 in palindrome.normalize
***Test Failed*** 3 failures.

The unexpected failure is that the empty string counts as a palindrome: normalize strips every Tamil character, and '' == ''[::-1] is True. We can easily solve that. For the Tamil issue we need a tokenizer with the power of Unicode to get GREEN.
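
Both problems can be reproduced by hand. The Python 3 sketch below is only an illustration: the variable names are made up for it, and the word is spelled with escapes so the code points are unambiguous. It shows that an ASCII-only filter leaves nothing of the Tamil word, and that reversing code points one by one separates each vowel sign (Unicode category `Mc`) from the letter it belongs to.

```python
import string
import unicodedata

# The word from the example, written as explicit code points.
word = "\u0bb4\u0bbf\u0b95\u0bb4\u0bbf"

# An ASCII-only filter, as in palindrome 1.0, removes every character.
ascii_only = "".join(c for c in word
                     if c in string.digits + string.ascii_letters)

# Letters are category 'Lo'; the vowel sign U+0BBF is a spacing mark, 'Mc'.
categories = [unicodedata.category(c) for c in word]

# Reversing code points puts the vowel sign in front of its letter.
naive_result = word == word[::-1]

if __name__ == "__main__":
    print(repr(ascii_only), categories, naive_result)
```

This is why the next step groups each `Mc` character with the letter before it instead of comparing single code points.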

# coding=utf-8
import unicodedata


def is_palindrome(input):
    """
    >>> is_palindrome("Level")
    True
    >>> is_palindrome("Palindrome")
    False
    >>> is_palindrome(12321)
    True
    >>> is_palindrome("Dammit, I'm mad!")
    True
    >>> is_palindrome("ழிகழி")
    True
    >>> is_palindrome("ழிகழை")
    False
    """
    input = tokenize(input)
    if not input:
        return False
    return input == input[::-1]

def tokenize(input):
    """
    >>> tokenize("Level")
    [u'l', u'e', u'v', u'e', u'l']
    >>> tokenize("Palindrome")
    [u'p', u'a', u'l', u'i', u'n', u'd', u'r', u'o', u'm', u'e']
    >>> tokenize(12321)
    [u'1', u'2', u'3', u'2', u'1']
    >>> tokenize("Dammit, I'm mad!")
    [u'd', u'a', u'm', u'm', u'i', u't', u'i', u'm', u'm', u'a', u'd']
    >>> tokenize("ழிகழி")
    [u'\u0bb4\u0bbf', u'\u0b95', u'\u0bb4\u0bbf']
    >>> tokenize("ழிகழை")
    [u'\u0bb4\u0bbf', u'\u0b95', u'\u0bb4\u0bc8']
    """
    tokens = []
    for char in ''.join(unicode(str(input).lower(), 'utf-8').split()):
        if unicodedata.category(char) == 'Mc':
            tokens[-1] += char
        elif unicodedata.category(char) != 'Po':
            tokens.append(char)
    return tokens
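
The tokenizer above relies on Python 2's `unicode` type. As a hedged sketch for Python 3 readers, where `str` is already Unicode, the same logic might look like the following; the reprs lose their `u` prefix, so the doctest expectations would have to change accordingly. This port is an assumption about the intended behavior, not code from the talk.

```python
import unicodedata


def tokenize(input):
    """Split input into comparable units, one per letter plus its vowel sign."""
    tokens = []
    # Lowercase, then drop all whitespace before looking at single characters.
    for char in "".join(str(input).lower().split()):
        category = unicodedata.category(char)
        if category == "Mc":
            tokens[-1] += char      # spacing mark: attach to the previous token
        elif category != "Po":
            tokens.append(char)     # keep everything except "other punctuation"
    return tokens


def is_palindrome(input):
    tokens = tokenize(input)
    if not tokens:
        return False                # an empty input is not a palindrome
    return tokens == tokens[::-1]
```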

We solved the Indian issue and we can go back home with palindrome 2.0 released...or maybe not. Let me introduce another test tool: COVERAGE.

$ sudo pip install coverage
$ coverage run -m doctest palindrome.py
$ coverage report -m
Name         Stmts   Miss  Cover   Missing
------------------------------------------
palindrome      14      1    93%   22

Lines 21-22: if not input: return False

I mentioned these statements earlier, but never wrote an example for them: during working hours something can always break your focus (meetings, colleagues, employers, lunch ...). Coverage helps in these situations.

Homework (or see the attachment): 100% coverage :)

Name         Stmts   Miss  Cover   Missing
------------------------------------------
palindrome      14      0   100%

# coding=utf-8
import unicodedata


def is_palindrome(input):
    """ Matches palindrome inputs
    >>> is_palindrome("Level")
    True
    >>> is_palindrome("Palindrome")
    False
    >>> is_palindrome(12321)
    True
    >>> is_palindrome("Dammit, I'm mad!")
    True
    >>> is_palindrome("ழிகழி")
    True
    >>> is_palindrome("ழிகழை")
    False
    >>> is_palindrome("")
    False
    """
    input = tokenize(input)
    if not input:
        return False
    return input == input[::-1]

def tokenize(input):
    """ Tokenizes input
    >>> tokenize("Level")
    [u'l', u'e', u'v', u'e', u'l']
    >>> tokenize("Palindrome")
    [u'p', u'a', u'l', u'i', u'n', u'd', u'r', u'o', u'm', u'e']
    >>> tokenize(12321)
    [u'1', u'2', u'3', u'2', u'1']
    >>> tokenize("Dammit, I'm mad!")
    [u'd', u'a', u'm', u'm', u'i', u't', u'i', u'm', u'm', u'a', u'd']
    >>> tokenize("ழிகழி")
    [u'\u0bb4\u0bbf', u'\u0b95', u'\u0bb4\u0bbf']
    >>> tokenize("ழிகழை")
    [u'\u0bb4\u0bbf', u'\u0b95', u'\u0bb4\u0bc8']
    """
    tokens = []
    for char in ''.join(unicode(str(input).lower(), 'utf-8').split()):
        if unicodedata.category(char) == 'Mc':
            tokens[-1] += char
        elif unicodedata.category(char) != 'Po':
            tokens.append(char)
    return tokens