Use Excel? Automate and speed up the boring stuff with Python!

The more familiar you get with code and databases the more you cringe when you hear that the data is in Excel. As soon as I got the hang of databases I would try my best to avoid excel at all costs. It’s slow, takes up a lot of memory for what it’s storing, gets pushed to the limit quicker and struggles to handle complex data relationships. However, there is no avoiding it. The main reason for this is that you don’t have to know any code to use Excel, and there are loads of people who do not know or need to know how to code collecting data and doing audits.

 

screen-shot-2016-09-22-at-17-50-38
This makes me cringe every time

 

My shove to step up to the plate came when an ED consultant emailed me asking for help. She had some information on people who were going on a diving trip with her in an Excel file. She wanted me to code a small script that went through the file and provided a small text file which contained the person’s parameters from the data in the Excel spreadsheet. The script also had to save the text file under the name of the person. This is why I love Python. Within 1 hour I had coded a program that did the job. It wasn’t pretty and it wasn’t compiled but it did the job. We could have 100s of rows in the Excel file and it would rip through them within seconds. So how did I do it so quickly? Is this is testament to my intelligence? Sadly no, it’s a testament to high-level programming languages and Python. All I had to do was the following:

 

screen-shot-2016-09-22-at-17-53-34
This is all the code that’s needed to open an excel file and rip through the data!

First of all, I imported the function: open_workbook from the xlrd module which is free and easy to download. I then utilize the open_workbook function with the parameter: “Medical_info.xlsx” which is the name of the file I want to open. There! I have got python to open the Excel file! Yes, it’s that simple! I then use a for loop to loop through the workbook sheets and then use another for loop to loop through the rows in the sheet. As for the count forget it, you don’t need it. I’m increasing the count every loop to make a note of how many records have been created. For each row, I’m looping through I assign column 0’s value to name, column 1’s value to email, and column 2’s value to mobile. Now you can do whatever you want with these variables. You can have conditional statements and get the computer to count how many rows have a certain value or are within a certain range, write particular rows to other files, write them to word files, heck you can even do machine learning if you know how. For me, I wrote another 25 lines of code to get the computer to write reports for each row and save them as text files.

If you want to do stats or count particular encounters 30 minutes of basic Python training will let you do this. I have seen so many junior doctors burn hours going through Excel files. However, they would have ended up saving time if they typed in the 6 lines of code at the start and learned some basic commands to count and categorize the rows of data. After the 30 to 45mins of training their computer would process all the data within seconds. Not only would they save time there and then, they would be walking away with a new skill that would save them time in the future when they’re doing another audit. Who knows, they may even build on this skill. Considering how Excel files are used in hospitals the only reason why python is not being used by doctors, nurses, and managers is that they are not aware of how easy, quick and powerful it is….. Plus it’s free. So have a go and spread the word. Feel free to contact me. If your project is a little specific I will email you back. It isn’t then due to the volume of emails it will be addressed in the next Bryan podcast.

 

Leave a Reply