This example explains Python decorators in the context of data science. The example acts as a quick reminder, rather than a complete guide.
Consider a Pandas DataFrame of posts on a social media platform. The DataFrame, called posts, contains a column with the number of likes for each post.
| post_id | ... | likes | ... |
|---|---|---|---|
| 1 | ... | 43 | ... |
| 2 | ... | 92 | ... |
| 3 | ... | 54 | ... |
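
To follow along, a DataFrame like the one above can be built by hand. The sketch below is only an assumption about its shape: it keeps the two columns shown in the table and omits the elided ones, since their names and contents are not given.

```python
import pandas

# Hypothetical sample data: only the post_id and likes columns from the
# table above; the columns elided with "..." are left out.
posts = pandas.DataFrame({
    "post_id": [1, 2, 3],
    "likes": [43, 92, 54],
})

# The likes column is inferred as an integer dtype from the Python ints.
print(pandas.api.types.is_integer_dtype(posts["likes"]))  # True
print(posts["likes"].mean())  # 63.0
```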
The following function calculates the average number of likes per post.
def average_likes(data):
    return data['likes'].mean()

We would like to decorate the function with a decorator checking that the likes column has integer type. Such a decorated function may look like the following.
@data_column_has_int_type('likes')
def average_likes(data):
    return data['likes'].mean()

The decorator can be defined in the following way.
import pandas

def data_column_has_int_type(column):
    def decorator(function):
        def wrapper(*args, **kwargs):
            # The DataFrame is expected as the first positional argument.
            data = args[0]
            if not pandas.api.types.is_integer_dtype(data[column]):
                raise ValueError(f"Column {column} does not have integer type.")
            return function(*args, **kwargs)
        return wrapper
    return decorator

Calling the decorated function, average_likes(posts), is equivalent to the following, where average_likes stands for the original, undecorated function:
data_column_has_int_type('likes')(average_likes)(posts)

This composition of functions unwraps as:
- data_column_has_int_type('likes') ⟶ decorator with column set to 'likes', equivalent to:

      def decorator(function):
          def wrapper(*args, **kwargs):
              data = args[0]
              if not pandas.api.types.is_integer_dtype(data['likes']):  # <- See change
                  raise ValueError(f"Column {column} does not have integer type.")
              return function(*args, **kwargs)
          return wrapper

- decorator(average_likes) ⟶ wrapper with function set to average_likes, equivalent to:

      def wrapper(*args, **kwargs):
          data = args[0]
          if not pandas.api.types.is_integer_dtype(data['likes']):
              raise ValueError(f"Column {column} does not have integer type.")
          return average_likes(*args, **kwargs)  # <- See change

- wrapper(posts) becomes:

      if not pandas.api.types.is_integer_dtype(posts['likes']):
          raise ValueError(f"Column {column} does not have integer type.")
      average_likes(posts)  # <- See change

This flow can be viewed compactly as:
data_column_has_int_type('likes')(average_likes)(data)
decorator(average_likes)(data) # column='likes'
wrapper(data) # column='likes', function=average_likes
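
The three steps can also be run one at a time to check the equivalence. The following is a minimal, self-contained sketch that repeats the definitions from this example. The sample DataFrame and the names bad_posts and error_message are hypothetical and serve only as illustration.

```python
import pandas


def data_column_has_int_type(column):
    def decorator(function):
        def wrapper(*args, **kwargs):
            data = args[0]
            if not pandas.api.types.is_integer_dtype(data[column]):
                raise ValueError(f"Column {column} does not have integer type.")
            return function(*args, **kwargs)
        return wrapper
    return decorator


def average_likes(data):
    return data["likes"].mean()


# Hypothetical sample data with an integer likes column.
posts = pandas.DataFrame({"post_id": [1, 2, 3], "likes": [43, 92, 54]})

# Step 1: fix column='likes' and obtain the actual decorator.
decorator = data_column_has_int_type("likes")
# Step 2: fix function=average_likes and obtain the wrapper.
wrapper = decorator(average_likes)
# Step 3: call the wrapper; the check passes and the mean is returned.
result = wrapper(posts)
print(result)  # 63.0

# With a float likes column the check fails before average_likes runs.
bad_posts = posts.astype({"likes": "float64"})
try:
    wrapper(bad_posts)
    error_message = None
except ValueError as error:
    error_message = str(error)
print(error_message)  # Column likes does not have integer type.
```

Writing @data_column_has_int_type('likes') above the function definition performs steps 1 and 2 and rebinds the name average_likes to the resulting wrapper.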
https://artemrudenko.wordpress.com/2013/04/15/python-why-you-need-to-use-wraps-with-decorators/
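
Judging by its title, the link above concerns functools.wraps. As a hedged sketch of how that would apply here (this variant of the decorator is not part of the original example), wrapping the inner function with functools.wraps copies metadata such as the name and docstring from the decorated function onto the wrapper. The docstring below is added only for illustration.

```python
import functools

import pandas


def data_column_has_int_type(column):
    def decorator(function):
        @functools.wraps(function)  # copy __name__, __doc__, etc. onto wrapper
        def wrapper(*args, **kwargs):
            data = args[0]
            if not pandas.api.types.is_integer_dtype(data[column]):
                raise ValueError(f"Column {column} does not have integer type.")
            return function(*args, **kwargs)
        return wrapper
    return decorator


@data_column_has_int_type("likes")
def average_likes(data):
    """Return the average number of likes per post."""
    return data["likes"].mean()


# Without functools.wraps, these would report 'wrapper' and None.
print(average_likes.__name__)  # average_likes
print(average_likes.__doc__)   # Return the average number of likes per post.
```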