This article will dive into the principles of algorithm design. If you haven't a clue what I'm referring to, read on!
When you hear the word "algorithm," you probably respond in one of three ways:
- You immediately know and understand what we're talking about because you studied computer science.
- You know that algorithms are the workhorses of companies like Google and Facebook, but you aren't really sure what the word means.
- You run and hide in fear because everything you know about algorithms reminds you of high-school Calculus nightmares.
If you are one of the second two, this article is for you.
What is an Algorithm, Exactly?
Algorithms are not a special type of operation, necessarily. They are conceptual, a set of steps that you take in code to reach a specific goal.
Algorithms have been commonly defined in simple terms as "instructions for completing a task". They've also been called "recipes". In The Social Network, an algorithm is what Zuckerberg needed to make Facemash work. If you saw the movie, you probably remember seeing what looked like a scribbly equation on a window in Mark's dorm room. But what does that scribbly algebra have to do with Mark's simple "hot or not" site?
Algorithms are indeed instructions. Perhaps a more accurate description would be that algorithms are patterns for completing a task in an efficient way. Zuckerberg's Facemash was a voting site to determine someone's attractiveness relative to a whole group of people, but the user would only be given options between two people. Mark Zuckerberg needed an algorithm that decided which people to match up to one another, and how to value a vote relative to that person's previous history and previous contenders. This required more intuition than simply counting votes for each person.
For instance, let's say you wanted to create an algorithm for adding 1 to any negative number, and subtracting 1 from any positive number, and doing nothing to 0. You might do something like this (in JavaScript-esque pseudo code):
function addOrSubtractOne(number){ if (number < 0) { return number + 1 } else if (number < 0) { return number - 1 } else if (number == 0) { return 0; } }
You may be saying to yourself, "That's a function." And you're right. Algorithms are not a special type of operation, necessarily. They are conceptual - a set of steps that you take in code to reach a specific goal.
So why are they a big deal? Clearly, adding or subtracting 1 to a number is a fairly simple thing to do.
But let's talk for a second about searching. To search for a number in an array of numbers, how would you think to do it? A naive approach would be to iterate the number, checking each number against the one you're searching for. But this isn't an efficient solution, and has a very wide range of possible completion times, making it an erratic and unreliable search method when scaled to large search sets.
function naiveSearch(needle, haystack){ for (var i = 0; i < haystack.length; i++){ if (haystack[i] == needle) { return needle; } } return false; }
Fortunately, we can do better than this for search.
Why is it Inefficient?
There is no better way to become a better algorithm designer than to have a deep understanding and appreciation for algorithms.
Let's say your array has 50,000 entries, and you brute-force search (that is, search by iterating the full array). The entry you are searching for, in the best case scenario, will be the first entry in the 50,000-entry array. In the worst case scenario, however, the algorithm will take 50,000 times longer to complete than in the best case scenario.
So what's Better?
Instead, you would search using binary search. This involves sorting the array (which I will let you learn about on your own) and subsequently dividing the array in half, and checking to see if the search number is greater or less than the halfway mark in the array. If it is greater than the halfway mark of a sorted array, then we know that the first half can be discarded, as the searched number isn't a part of the array. We can also cut out a lot of work by defining the outer bounds of the array and checking to see if the searched number exists outside of those bounds, and if so, we have taken what would have been a multiple-iteration operation and turned it into a single iteration operation (which in the brute-force algorithm would have taken 50,000 operations).
sortedHaystack = recursiveSort(haystack); function bSearch(needle, sortedHaystack, firstIteration){ if (firstIteration){ if (needle > sortedHaystack.last || needle < sortedHaystack.first){ return false; } } if (haystack.length == 2){ if (needle == haystack[0]) { return haystack[0]; } else { return haystack[1]; } } if (needle < haystack[haystack.length/2]){ bSearch(needle, haystack[0..haystack.length/2 -1], false); } else { bSearch(needle, haystack[haystack.length/2..haystack.length], false); } }
Sounds Fairly Complicated
Take the seemingly complicated nature of a single binary search algorithm, and apply it to billions of possible links (as searching through Google). Beyond that, let's apply some sort of ranking system to those linked searches to give an order of response pages. Better yet, apply some kind of seemingly randomized "suggestion" system based on artificial intelligence social models designed to identify who you might want to add as a friend.
This gives us a much clearer understanding of why algorithms are more than just a fancy name for functions. At their finest, they are clever, efficient ways of doing something that requires a higher level of intuition than the most apparent solution. They can take what might would require a supercomputer years to do and turn it into a task that finishes in seconds on a mobile phone.
How do Algorithms Apply to Me?
For most of us as developers, we aren't designing high-level abstracted algorithms on a daily basis.
Luckily, we stand on the shoulders of the developers who came before us, who wrote native sort functions and allow us to search strings for substrings with indexOf in an efficient manner.
But we DO, however, deal with algorithms of our own. We create for
loops and write functions every day; so how can good algorithm design principles inform the writing of these functions?
Know Your Input
One of the main principles of algorithmic design is to, if possible, build your algorithm in such a way that the input itself does some of the work for you. For instance, if you know that your input is always going to be numbers, you do not need to have exceptions/checks for strings, or coerce your values into numbers. If you know that your DOM element is the same every time in a for
loop in JavaScript, you shouldn't be querying for that element in every iteration. On the same token, in your for
loops, you shouldn't use convenience functions with overhead if you can accomplish the same thing using (closer to) simple operations.
// don't do this: for (var i = 1000; i > 0; i--){ $("#foo").append("<span>bar</span>"); } // do this instead var foo = $("#foo"); var s = ""; for(var i = 1000; i > 0; i--){ s += "<span>bar</span>"; } foo.append(s);
If you are a JavaScript developer (and you use jQuery) and you don't know what the above functions are doing and how they are significantly different, the next point is for you.
Understand Your Tools
At their finest, [algorithms] are clever, efficient ways of doing something that requires a higher level of intuition than the most apparent solution.
It is easy to think that this goes without saying. However, there is a difference between "knowing how to write jQuery" and "understanding jQuery". Understanding your tools means that you understand what each line of code does, both immediately (the return value of a function or the effect of a method) and implicitly (how much overhead is associated with running a library function, or which is the most efficient method for concatenating a string). To write great algorithms, it is important to know the performance of lower-level functions or utilities, not just the name and implementation of them.
Understand the Environment
Designing efficient algorithms is a full-engagement undertaking. Beyond understanding your tools as a standalone piece, you must also understand the way that they interact with the larger system at hand. For instance, to understand JavaScript in a specific application entirely, it is important to understand the DOM and performance of JavaScript in cross browser scenarios, how available memory affects rendering speeds, the structure of servers (and their responses) you may be interacting with, as well as a myriad of other considerations that are intangible, such as usage scenarios.
Reducing the Workload
In general, the goal of algorithm design is to complete a job in fewer steps. (There are some exceptions, such as Bcrypt hashing.) When you write your code, take into consideration all of the simple operations the computer is taking to reach the goal. Here is a simple checklist to get started on a path to more efficient algorithm design:
- Use language features to reduce operations (variable caching, chaining, etc).
- Reduce iterative loop nesting as much as possible.
- Define variables outside of loops when possible.
- Use automatic loop indexing (if available) instead of manual indexing.
- Use clever reduction techniques, such as recursive divide and conquer and query optimization, to minimize the size of recursive processes.
Study Advanced Techniques
There is no better way to become a better algorithm designer than to have a deep understanding and appreciation for algorithms.
- Take an hour or two every week and read The Art of Computer Programming.
- Try a Facebook Programming Challenge or a Google Codejam.
- Learn to solve the same problem with different algorithmic techniques.
- Challenge yourself by implementing built in functions of a language, like
.sort()
, with lower level operations.
Conclusion
If you didn't know what an algorithm was at the start of this article, hopefully, now, you have a more concrete understanding of the somewhat elusive term. As professional developers, it is important that we understand that the code we write can be analyzed and optimized, and it is important that we take the time to do this analysis of the performance of our code.
Any fun algorithm practice problems you've found? Perhaps a dynamic programming "knapsack problem", or "drunken walk"? Or maybe you know of some best practices of recursion in Ruby that differ from the same functions implemented in Python. Share them in the comments!
Comments