Oftentimes, the way in which we write code depends on how we got started with programming.
For example, if someone has a formal education in computing, they are likely to know a variety of principles that will prevent them from writing poor quality code. This isn't a rule, of course, but an observation.
Similarly, those who get into programming on their own, outside of a formal setting, often end up teaching themselves and learning from a variety of different resources. Sometimes, this can cause problems with the type of code that's written.
To be absolutely clear: I am not saying that a formal education in computing trumps someone who teaches themselves, nor vice versa. Some fantastic programmers are those who are educated; others are those who are self-taught. But the one thing both have in common with one another is that they are not exempt from writing code smells from time to time.
In this article, we're going to take an introductory look at code smells. We're going to examine what they are, what they look like, and how they often manifest themselves in the work we do. We'll be using PHP for our examples.
In the second part of this series, we're going to take a look at how we can avoid writing code smells. Specifically, we'll be using PHP, a couple of tools, and WordPress as the environment of choice.
But first, let's have an introduction to code smells.
What Are Code Smells?
Depending on the type of developer you ask, you'll likely get a variation on the following definition:
Code smell, also known as a bad smell, in computer programming code, refers to any symptom in the source code of a program that possibly indicates a deeper problem.
This definition is straight from Wikipedia. It's not bad, but it's not my favorite take on the topic. Instead, my favorite definition comes from a prolific and popular programmer named Martin Fowler. You can read his entire take on the topic of code smells in this article, but I've distilled it down to the following points:
- "A long method is a good example of this - just looking at the code and my nose twitches if I see more than a dozen lines of Java."
- "Smells aren't inherently bad on their own - they are often an indicator of a problem rather than the problem themselves."
- "Data classes (classes with all data and no behavior) are good examples."
If you haven't read the article, he ends with this:
One of the nice things about smells is that it's easy for inexperienced people to spot them, even if they don't know enough to evaluate if there's a real problem or to correct them.
And I think that's a fantastic statement because it means this topic is positioned perfectly for those who are programmers but aren't sure where to start when it comes to identifying and addressing code smells.
In short, experience doesn't play a large role in this. Sure, those who are more experienced are likely to more easily identify smells (because they've seen more of them), but less experienced developers should be able to identify them.
Do You Have Examples?
Of course, when it comes to a topic like code smells, it's often easier to talk about them at an abstract level that it is to actually do something about them. But that's not practical, nor is it applicable to our day-to-day work.
With that said, why not take a look at some example of code smells? We'll examine why they are problematic and then offer a solution for how the smell could be removed by refactoring the code.
Example 1: Clear Naming Conventions
One of the easiest code smells to spot is when a programmer has opted to use unclear variable names. That is, in the context of the code, it's not clear as to what a given variable is supposed to represent.
Sure, there are times where this is acceptable (like using an i
in a for
loop). But in a longer method, it's not quite the same as for
loop).
For example:
<?php public function get_things( $x ) { $l = array(); for ( $i = 0; $i < count( $x ); $i++ ) { if ( true === $x[ $i ] ) { array_push( $l, $x[ $i ] ); } } return $l; }
Given enough time, we'd likely be able to figure out what this is doing. First, it's a relatively simple method. Second, we'd be able to read through the variables, blocks, and return
value to gain a better idea as to what's happening.
But if we're looking to write clean code that's easier to understand as we refer back to it or another programmer works with it, then using clear naming conventions always helps. And the above code does not use clear naming conventions.
Instead, let's refactor it so that it looks like this:
<?php public function get_flagged_items( $items ) { $flagged_items = array(); for ( $i = 0; $i < count( $items ); $i++ ) { $current_item = $items[ $i ] ; if ( true === $current_item ) { array_push( $flagged_items, $current_item ); } } return $flagged_items; }
Much easier to understand, isn't it?
Notice that the algorithm itself hasn't changed, but the function name and the variable names have. This has made that much of a difference in reading this code. If asked what this code is supposed to be doing, we can easily say something like:
Returns an array of items marked as true from a predefined array of items.
As much as possible, avoid using generic variable names and use whatever is the clearest option for you and your peers.
Example 2: Stay DRY
In programming, you would be hard-pressed to find a developer who had not heard of KISS or DRY (you know, "don't repeat yourself"). Despite this, we often do repeat ourselves.
This is indicative of the fact that attempting to adhere to the DRY principles lands differently with different types of programmers. And that's fine! There is no single way to demonstrate how to adhere to this principle.
But because there are multiple ways to do so, we can provide an example of what it should not look like and what it should look like.
Assume for the purposes of the following example that we have a function called save_post and it accepts two arguments: a post ID and a string representing the title of the post. A code smell would look something like this:
<?php public function save_posts() { save_post( 1, 'Hello World!' ); save_post( 2, 'Goodbye World!' ); save_post( 3, 'What is this new world?' ); }
But why would we manually type out a call to save_post
three times? Instead, let's set up an associative array, iterate through it, and then call the method once per iteration.
<?php public function save_posts() { $posts = [ 1 => 'Hello World!', 2 => 'Goodbye World!', 3 => 'What is this new world?', ]; foreach ( $post as $post_id => $post_title ) { save_post( $post_id, $post_title ); } }
Though calling the method once is nice, the method could be made even more flexible by having it accept the array of posts as an argument and leaving the foreach
loop intact, though that's not really the point of this example.
If you find yourself making the same method call multiple times in a function but with different parameters, you may have a code smell. And if you do, then look for ways to refactor it so that you're calling the method only once.
After all, you don't want to repeat yourself.
Example 3: Long Parameter Lists
One of the more common things that we continue to see in programming, regardless of the language, is when a function accepts a large number of parameters.
Different programmers will have opinions on what's the ideal number of parameters a function should accept, but I tend to think three (give or take two, maybe) is a good number.
First, let's take a look at what a function with a long parameter list would look like. There are likely no surprises in the following code, and you may be dealing with something exactly like this in one of your current projects:
<?php public function submit_order( $first, $last, $address1, $address2, $city, $state, $zip, $phone, $cc, $expiration ) { /* Attempt to submit the order. If the order is successful, * then return an instance of an Order object with the success status; * otherwise, return an instance of an Order object with the failed status * and a message. */ }
Notice in the above example we aren't concerned with the implementation of the function. Instead, we're concerned with how many parameters it requires. That's a lot of information to send, and it will make the method call that much uglier, as well.
It doesn't even hit on the topic of verification and validation. But I digress on that.
How could this one be cleaned up? Personally, I'm a fan of creating classes to represent collections of information like this. For this particular example, we could have a class that presents a person's contact information. Furthermore, that person could be associated with a credit card number.
The details of this could be enforced using business logic elsewhere in the application, but the abstraction would look something like this:
<?php class Contact_Information { /* Maintains attributes of the person's contact information. */ } class Payment_Information { /* Maintains the credit card number and other information for a person. */ } class Order { public function submit( $contact_info, $payment_info ) { /* Attempt to submit the order. If the order is successful, * then return an instance of an Order object with the success status; * otherwise, return an instance of an Order object with the failed status * and a message. */ } }
This refactoring, although small, is the largest we've done in this particular article. Notice that we've done the following things:
- Created a
Contact_Information
class that allows us to instantiate an object that includes all payment information for a person. - Created a
Payment_Information
class that allows us to maintain the credit card or debit card number for a person as well as other details associated with that payment method. - Created an
Order
class, placed thesubmit_order
function inside of it, renamed it to submit (sincesubmit_order
would be redundant), and decreased its parameter list to two values: an instance of theContact_Information
class and thePayment_Information
class).
To be clear, this example does not handle the verification of association between contact information and payment information, nor does it show other classes that may be necessary (such as whether or not payment failed for the transaction).
But that's not the point of the exercise.
Instead, we're looking at the code smell of long parameter lists and how we can decrease them using practical, more maintainable methods available to us.
Whenever you're writing a function or making a call to a function that requires a large number of arguments, look for ways to refactor that function. It will make the cohesion of the code increase and the smell decrease.
Conclusion
Remember, the examples that we've looked at above are just that. The list is by no means complete, but they are common smells that you're likely to see in code with which you work or even code you write. I'll be the first to admit that I'm guilty of this.
Additionally, there are many, many resources that are available when it comes to identifying and fixing code smells. Luckily, we also have a number of tools at our disposal that will help us discover them automatically and clean them up.
And that's where we're headed next. Specifically, we're going to use PHP CodeSniffer in order to help us avoid code smells in our code. Then, we're going to see how to incorporate the WordPress rules into PHP CodeSniffer and hook it up to our IDE of choice.
As mentioned earlier in this article, the next article in the series will focus more on code smells when writing code for WordPress. We'll take a look at some tools and resources available to help make sure we're avoiding code smells, and to help make sure we're writing stronger code.
In the meantime, study the examples above and check out the resources that I've provided as they are great places to learn more about smells and code refactoring from people and places that are notable within our industry.
Remember, you can catch all of my courses and tutorials on my profile page, and you can follow me on my blog and/or Twitter at @tommcfarlin where I talk about various software development practices and how we can employ them in WordPress.
Please don't hesitate to leave any questions or comments in the feed below, and I'll aim to respond to each of them.
Comments