The community is growing. I have recently been teaching code at my London Python coding for medics group, and I have been engrossed in the conversations of other clinical developers who have come out of the woodwork. What has come up in both the conversations and the teaching sessions is how to solve problems logically. Regardless of what language we code in, a good programmer will be able to develop an algorithm that’s quick, simple, and solves the problem. This shouldn’t be surprising. The skill of a good mathematician is displayed in expressing complex relationships by simple equations. A good wordsmith can express a lot in a few sentences. It stands to reason that a skilled programmer can solve a problem with fewer lines of code than a beginner. For an introduction, I use the drug-sorting problem.
Someone gives you a list of drugs. The list is messy and it has repetitions. It’s clear that this is raw data pulled from a database. Your boss wants you to go through the data and calculate how many combinations of two drugs you can make from the drugs in the data. These have to be combinations. You cannot have two drugs that are the same in the combination.
This raises a few problems. You have to be comfortable with the concept of loops. You also have to understand what a cache is, and you have to be able to filter results. To make our solution quicker and simpler, we have to filter out the duplicates in the data and store one copy of each drug in a cache. This means that we won’t be repeating work on the same drug combinations when going through the data. Once we have refined the list, we can loop through it, finding and counting the combinations. This is easier said than done, but I recommend thinking through the problem and coming up with a solution in your own language before reading on. Exercising your problem-solving muscles is how you’re going to improve your ability to solve problems with code and increase your value as a coder.
The example code here is in Python, but the logic transfers to any other language. Here we create a cache called refine_list, which is simply an empty list. This is where we are going to store one of each drug in the drug_list. We do this by looping through the drug_list and setting up a conditional. If the subject of the loop is not in the refine_list, we append it to the refine_list. Otherwise we pass. When the loop comes across a drug that is already in the refine_list, we simply pass over it because it is already in the cache (refine_list), so there will be no duplicates.
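The filtering step described above might look something like this. It is a minimal sketch: the names drug_list and refine_list come from the description, but the sample drugs are made up for illustration and real data would come from your database pull.

```python
# Hypothetical sample data standing in for the messy database export.
drug_list = ["aspirin", "ibuprofen", "aspirin", "paracetamol", "ibuprofen", "codeine"]

# The cache: it will end up holding exactly one copy of each drug.
refine_list = []

for drug in drug_list:
    if drug not in refine_list:
        # First time we have seen this drug, so store it in the cache.
        refine_list.append(drug)
    else:
        # Already in the cache, so skip it.
        pass

print(refine_list)
```

The order in which each drug first appears is preserved. For long lists, Python’s built-in set would be a faster way to remove duplicates, but the explicit loop makes the logic easier to follow and to carry over to other languages.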
Now that we have the refined data, we want to loop through it to find the combinations and count them. First of all, we define the count and set it to zero. We then loop through the refined data. Here comes the twist: we have another loop within our first loop. This means that for every drug we pick out in the first loop, we loop through the refined data again. However, we know that if both drugs are the same we cannot make a combination, so we need a conditional. As we loop through the refined data for each drug, we check whether the drug in the first loop is the same as the drug in the second loop. If it is, we pass. If not, we increase the count by one and print out the combination that can be made. Once we’ve finished the two loops, we print the count to show the user how many combinations can be made.
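As a rough sketch of the two loops just described, assuming a refine_list that has already been de-duplicated (the drug names and the loop variable names are hypothetical):

```python
from itertools import combinations

# Assumed to be the already de-duplicated list; sample names are made up.
refine_list = ["aspirin", "ibuprofen", "paracetamol", "codeine"]

count = 0
for first_drug in refine_list:
    for second_drug in refine_list:
        if first_drug == second_drug:
            # A drug cannot be combined with itself.
            pass
        else:
            count += 1
            print(first_drug, "+", second_drug)

print("Number of combinations:", count)

# Optional cross-check: pairs where the order of the two drugs is ignored.
unordered_count = len(list(combinations(refine_list, 2)))
print("Pairs ignoring order:", unordered_count)
```

One thing worth noticing: the nested loop counts aspirin + ibuprofen and ibuprofen + aspirin as two separate results. If your boss only cares about which two drugs are paired and not the order, the number is half of count, which is what the itertools.combinations cross-check at the end reports.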
There are many more things you can do with this. Let’s say our boss runs in and tells us that drug A cannot be mixed with drug B! No worries. With your thinking cap on, this can be handled with four lines of code.
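One way this change might look is sketched below. This is an illustration under assumptions rather than the exact four lines: the banned_pair name and the placeholder drug names are hypothetical, and the only real change to the nested loop is one extra check.

```python
# Placeholder names for illustration only.
refine_list = ["drug A", "drug B", "drug C", "drug D"]

# Hypothetical rule from the boss: these two must never be combined.
banned_pair = {"drug A", "drug B"}

count = 0
for first_drug in refine_list:
    for second_drug in refine_list:
        if first_drug == second_drug:
            # A drug cannot be combined with itself.
            pass
        elif {first_drug, second_drug} == banned_pair:
            # Skip the forbidden mix, whichever order it turns up in.
            pass
        else:
            count += 1
            print(first_drug, "+", second_drug)

print("Number of combinations:", count)
```

Comparing sets rather than checking each name separately means the rule catches both drug A + drug B and drug B + drug A with a single condition.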
Once you’ve got your head around this, you’re well on your way to solving data problems that couldn’t be done in Excel. In fact, you can do that now!