This article is a brief introduction to Python functions. In any programming language, be it Python, R, Scala or anything else, functions play a very important role. Data science projects involve repetitive tasks, such as filtering raw data and other preprocessing steps, that have to be performed again and again. This is where functions become a data scientist's best friend: instead of rewriting the same steps every time, you simply call the relevant function.
Functions, both built-in and user-defined, are a basic yet critical component of any programming language, and Python is no exception. Here is a brief overview so that you can start taking advantage of them.
Why use Python for data science? Python is one of the most popular languages among data enthusiasts. One of the reasons is that Python is easy to read and write compared to many other languages.
Besides, there are lots of third-party libraries which make data science tasks a lot easier. Libraries like Pandas, NumPy, Scikit-Learn, Matplotlib and seaborn contain numerous modules catering to almost every kind of operation you may wish to perform in data science. Libraries like TensorFlow and Keras are designed specifically for deep learning applications.
Please read these articles about the use of Python in Machine Learning and Deep Learning to know more about the use of Python in data science.
If you are a beginner, or you have some basic experience coding in other programming languages, this article will help you get started with Python functions as well as with creating new ones. I will discuss some important Python functions, writing your own functions for repetitive tasks, and handling Pandas data structures, all with easy examples.
Like integers, strings and other data types, functions are first-class citizens in Python. They can be created and destroyed dynamically, defined inside other functions, passed as arguments to other functions, returned as values, and so on.
In data science in particular, we need to perform many mathematical operations and pass the calculated values on. So Python functions play a crucial role in data science: they carry out repetitive calculations, serve as nested functions, are passed as arguments to other functions, and so on.
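To make the "first-class" idea concrete, here is a minimal sketch. The function names mean and spread and the sample data are assumptions made up for illustration; the point is only that a function can be bound to another name or stored in a dictionary like any other value.

```python
def mean(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def spread(values):
    """Return the difference between the largest and smallest value."""
    return max(values) - min(values)


# A function can be assigned to another name like any other object...
average = mean

# ...and stored in a data structure such as a dictionary.
summaries = {"mean": mean, "spread": spread}

data = [2, 4, 6, 8]
for name, fn in summaries.items():
    print(name, fn(data))
```

Adding a new summary then only means adding one more entry to the dictionary.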
So without much ado, let's jump into the details and look at some really interesting uses of functions with examples.
Use of Python functions for data science
Using functions is of utmost importance not only in Python but in any programming language. Whether they are built-in or user-defined, you should have a clear idea of how to use them. Functions are a powerful way to keep your code well structured and to increase its reusability.
Some functions come with Python; we just need to call these built-in functions to perform the tasks they are designed for. Most of the basic tasks we need frequently in data operations are well covered by them. To start with, I will discuss some of these important built-in Python functions.
Built-in Python functions
Let’s start with some important built-in functions of Python. These are already included and make your coding experience much smoother. The only condition is that you have to be aware of them and use them frequently. The first function we will discuss is help().
So take help()
Python functions take care of most of the tasks we want to perform through coding. But a common question that comes to any beginner’s mind is: how will I know about all these functions? The answer is to ask for help.
The help() function is there in Python to tell you the details of any function you need to use. You just pass the function to help(). See the example below.
# Using help
help(print)
Here I want to know about the print function, so I passed it to help(). The output describes everything you need to know to apply the function: the function header, the optional arguments you can pass and their roles, and a brief plain-English description of what the function does.
Interestingly, you can learn all about the help() function using help() itself :). It is great to see the output. Please type it in to see for yourself.
# Using help() for help
help(help)
Again, help() has produced all the necessary details about itself. It says that the help() function is actually a wrapper around pydoc.help that provides a helpful message when the user types “help” in the Python interactive prompt.
List() function
A list is a collection of objects of the same or different data types. Lists are used very frequently in data science to store data for later operations. See the code below, which creates a list holding different data types.
# Defining the list item
list_example=["Python", 10, [1,2], {4,5,6}]
# Printing the data type
print(type(list_example))
# Printing the list
print(list_example)
# Using append function to add items
list_example.append(["new item added"])
print(list_example)
The above code creates a list containing a string, an integer, a nested list and a set. The type() function is used to print the data type. Finally, the append() method is used to add an extra item to the list (here, a one-element list). Let’s see the output.
So, the data type is list. All the list items are printed, and a new item has been appended to the list with append(). Note this function, as it is very handy while performing data analysis. You can also build a complete list from scratch using only append(); see the example below.
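The example referred to here is not reproduced, so the following is only a minimal sketch of the idea. The list name squares and the range of numbers are assumptions chosen for illustration.

```python
# Start from an empty list and fill it one item at a time with append().
squares = []
for number in range(1, 6):
    squares.append(number ** 2)

print(squares)       # [1, 4, 9, 16, 25]
print(len(squares))  # 5
```

This pattern is handy when the values are computed one by one, for example while looping over the rows of a data file.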
sorted() function
This is another important function that we need frequently in numeric computation. For example, a very basic use of sorted() is in calculating the median of sample data: to find the median, we need to sort the data first. By default sorted() returns a new list arranged in ascending order and leaves the original list untouched, but you can get descending order with the reverse argument. (The list method sort() does the same job in place.) See the example below.
# Example of sorted function
list_new=[5,2,1,6,7,4,9]
# Sorting and printing the list
print("Sorting in an ascending order:",sorted(list_new))
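# Sorting the list in descending order and printing
print("Sorting in descending order:", sorted(list_new, reverse=True))
And the output of the above code is as below:
Sorting in an ascending order: [1, 2, 4, 5, 6, 7, 9]
Sorting in descending order: [9, 7, 6, 5, 4, 2, 1]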
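The median use case mentioned above can be sketched as follows. The helper name median is an assumption for illustration; in real projects the standard library's statistics.median does the same job.

```python
def median(values):
    """Return the median of a non-empty sequence of numbers."""
    ordered = sorted(values)  # sorted() returns a new list
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd count: the middle element is the median.
        return ordered[middle]
    # Even count: average the two middle elements.
    return (ordered[middle - 1] + ordered[middle]) / 2


list_new = [5, 2, 1, 6, 7, 4, 9]
print(median(list_new))      # 5
print(median([5, 2, 1, 6]))  # 3.5
print(list_new)              # the original order is unchanged
```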
round() function
This function is useful for rounding numbers to a desired number of decimal places. The number of decimal places is passed as the second argument, and that argument has an interesting property. See the example below and try to guess the output; it is really interesting.
# Example of round() function
print(round(37234.154))
print(round(37234.154,2))
print(round(37234.154,1))
print(round(37234.154,-2))
print(round(37234.154,-3))
Can you guess the output? Notice that the second argument can be negative too! Let’s see the output and then explain what the function does to a number.
When round() is called without a second argument, it rounds to the nearest integer, so 37234.154 becomes 37234. It keeps two decimals when the argument is 2 (37234.15) and one decimal when it is 1 (37234.2). When the second argument is -2 or -3, it returns the closest multiple of 100 or 1000, here 37200.0 and 37000.0.
If you are wondering where on earth such a feature is useful, let me tell you that there are occasions, such as quoting a large amount (money, distance, population, etc.), where we don’t need an exact figure and a rounded number does the job. In such cases, round() with a negative argument makes the figure easier to remember.
There are many more built-in functions, and we will touch on them in other articles; here I have covered only a few as examples. Let’s move on to the next section, on user-defined functions, which give you the freedom to create your own functions.
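A few more cases are worth trying, shown here as a small sketch assuming Python 3. The population figure is made up for illustration. Note that values exactly halfway between two integers are rounded to the nearest even integer, which surprises many beginners.

```python
population = 1234567  # illustrative figure, not real data

# With an integer input, the result stays an integer.
print(round(population, -3))  # 1235000
print(round(population, -6))  # 1000000

# round() rounds to the nearest integer, it does not just drop decimals.
print(round(2.7))             # 3

# Exact halves go to the nearest even integer in Python 3.
print(round(2.5), round(3.5))  # 2 4
```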
User defined functions
After built-in functions, here we will learn about user-defined functions. If you are learning Python as your first programming language, then I should tell you that functions are among the most effective as well as the most interesting parts of any programming language.
A coder’s expertise depends on how skilled they are at creating functions to automate repetitive tasks. Instead of writing code for the same task again and again, a skilled programmer writes a function for it and just calls it when the need arises.
Below is an example of how you can create a function that adds two numbers.
# An example of user defined function
def add(x, y):
    '''This is a function to add two numbers'''
    total = x + y
    print("The sum of x and y is:", total)
The above is an example of creating a function which will add two numbers and then print the output. Let’s call the function to add two numbers and see the result.
I have called the function, passed two numbers as arguments, and the user-defined function printed the result of adding them. Now any time I need to add two numbers, I can just call this function instead of writing those few lines again and again.
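The call itself is not reproduced here, so this is only a sketch: the arguments 5 and 4 are assumed values, not necessarily the ones used originally. The second function, add_and_return, is a hypothetical variant added to show how the result could be handed back for further calculation instead of only being printed.

```python
def add(x, y):
    '''This is a function to add two numbers'''
    total = x + y
    print("The sum of x and y is:", total)


def add_and_return(x, y):
    '''Hypothetical variant: return the sum instead of printing it'''
    return x + y


add(5, 4)  # prints: The sum of x and y is: 9

# A returned value can be reused in later calculations.
result = add_and_return(5, 4)
print(result * 2)  # 18
```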
Now, if we call help() on this function, what will it return? Let’s see.
As you can see, help() has returned the text I put inside the triple-quoted string. It is called the docstring. A docstring lets us describe what the function is for. It is very helpful because complex programs require a lot of user-defined functions. The function name should indicate its use, but often that is not enough. In such cases, a brief docstring quickly reminds you what the function does.
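Besides help(), the docstring can also be read directly from the function object. The sketch below redefines add() so that it is self-contained; the __doc__ attribute and inspect.getdoc() are standard Python.

```python
import inspect


def add(x, y):
    '''This is a function to add two numbers'''
    total = x + y
    print("The sum of x and y is:", total)


# help(add) shows the function signature followed by this same text.
print(add.__doc__)

# inspect.getdoc() returns the docstring with indentation cleaned up.
print(inspect.getdoc(add))
```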
Optional arguments in user-defined function
Sometimes giving an argument a default value saves us from writing additional lines. Such an argument becomes optional. See the following example:
# Defining functions
def hi(Hello="World"):
    print("Hello", Hello)
hi()
hi("Python")
hi()
Can you guess the output of the function calls above? Just for fun, try without looking at the output below. While guessing, notice that the function is called once with the optional argument supplied.
Here is the output.
Hello World
Hello Python
Hello World
In the first call, the function printed the default value. When we passed “Python” as the argument, it overrode the default. In the third call, again without an argument, the default was printed once more. You should try any other combinations that come to mind; it is great fun and will make the concept clearer.
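Default arguments are just as useful in numeric helpers. The following is a small sketch; the function name describe, its parameter decimals and the sample values are assumptions made up for illustration.

```python
def describe(values, decimals=2):
    """Return the mean of values, rounded to `decimals` places."""
    mean = sum(values) / len(values)
    return round(mean, decimals)


sample = [1.0, 2.0, 2.5]

print(describe(sample))              # 1.83, the default of 2 decimals is used
print(describe(sample, 1))           # 1.8, default overridden by position
print(describe(sample, decimals=0))  # 2.0, default overridden by keyword
```

Passing the optional argument by keyword, as in the last call, keeps the call readable when a function has several defaults.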
Nested functions
A nested function is a function defined inside another function. This is also one of the important Python basics for data science. Below is an example of a very simple nested function. Try it yourself to check the output.
# Example of nested functions
def outer_function(msg):
    # This is the outer function
    def inner_function():
        print(msg)
    # Calling the inner function
    inner_function()
# Calling the outer function
outer_function("Hello world")
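A nested function becomes more interesting when the outer function returns it instead of calling it. The sketch below uses made-up names (make_scaler, scale); the inner function remembers the factor it was created with.

```python
def make_scaler(factor):
    """Return a function that multiplies its input by factor."""
    def scale(value):
        # The inner function can still see `factor` from the outer call.
        return value * factor
    return scale


to_percent = make_scaler(100)
double = make_scaler(2)

print(to_percent(0.25))  # 25.0
print(double(21))        # 42
```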
Functions passed as argument of another function
Functions can also be passed as arguments to other functions. It may sound a little confusing at first, but it is a really powerful property among the Python basics for data science. Let’s take an example to discuss it. See the piece of code below.
# Calling functions in a function
def add(x):
    return 5 + x
def call(fn, arg):
    return fn(arg)
def call_twice(fn, arg):
    return fn(fn(arg))
print(
    call(add, 5),
    call_twice(add, 5),
    sep="\n"
)
Again, try to understand the logic and guess the output. Copy the code and make small changes to see how the output changes or what errors it produces. The output I got from this code is as below.
10
15
Did you guess it right? Here we have created three functions, namely add(), call() and call_twice(), and then passed add() into the other two. The call() function calls add() with the argument 5 and returns the result, so the output is 10.
In a similar fashion, call_twice() returns 15 because its return statement applies the function twice: add(5) gives 10, and add(10) gives 15. I know it is confusing to some extent. That is because this example does not serve a real purpose. When you create such functions to solve an actual problem, the concept will become clear. So, do some practice with the code given here.
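A slightly more purposeful sketch of the same idea follows. The helper apply_to_all, the conversion function and the sample readings are assumptions made up for illustration; the key argument of the built-in sorted() is a real place where Python expects a function to be passed.

```python
def apply_to_all(fn, values):
    """Return a new list with fn applied to every item of values."""
    return [fn(value) for value in values]


def celsius_to_fahrenheit(celsius):
    return celsius * 9 / 5 + 32


readings = [0, 25, 100]
print(apply_to_all(celsius_to_fahrenheit, readings))  # [32.0, 77.0, 212.0]

# Built-in functions accept functions too: here len decides the sort order.
libraries = ["seaborn", "NumPy", "pandas"]
print(sorted(libraries, key=len))  # ['NumPy', 'pandas', 'seaborn']
```

The same apply_to_all() helper works with any one-argument function, which is exactly why passing functions around saves repeated code.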