NoSQL has been one of the most talked-about topics of the past couple of months. This tutorial will introduce you to CouchDB, a NoSQL implementation, and teach you how to get started with the platform.
What is NoSQL?
NoSQL [not only SQL] is a movement towards data stores, document stores among them, that do not make use of the relational model. The fundamental paradigm shift is in the way they store data. For example, to store data about an invoice in an RDBMS, you'd need to distill the information into tables and then use a server-side language to transform the data back into real-life objects. In NoSQL, on the other hand, you just store the invoice. NoSQL is schema free, which means you don't need to design your tables and structure up front -- you can simply start storing new values.
Continuing the invoice example, some invoices may include a VAT number and some may not. In an RDBMS, you'd need to add a VAT number column to your table and then declare that it may be null. In NoSQL, however, you can just store invoices with or without a VAT number -- there is no schema. Keep in mind that NoSQL is not a silver bullet. If your data is truly relational, sticking with your RDBMS would be the right choice.
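To make the invoice example concrete, here is a minimal sketch of two invoice documents that could sit side by side in the same database. The field names (customer, total, vatNumber) and the describeVat helper are assumptions made up for this illustration, not anything CouchDB prescribes.

```javascript
// Two hypothetical invoice documents stored in the same database.
const invoiceWithVat = {
  type: "invoice",
  customer: "Acme Ltd",
  total: 120.0,
  vatNumber: "GB123456789", // present only when the customer has one
};

const invoiceWithoutVat = {
  type: "invoice",
  customer: "Jane Doe",
  total: 35.5,
  // no vatNumber field at all: nothing has to be declared nullable
};

// The application, not a schema, decides how to treat the missing field.
function describeVat(invoice) {
  return "vatNumber" in invoice ? invoice.vatNumber : "no VAT number";
}

console.log(describeVat(invoiceWithVat));
console.log(describeVat(invoiceWithoutVat));
```

The trade-off is that the check for a missing field moves out of the table definition and into your application code.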
Querying NoSQL Databases
Many NoSQL databases, CouchDB included, use map/reduce to query and index the database. In an RDBMS, you run a query that joins multiple tables together to create a pool of data, and the query then produces a resultset, a subset of the overall data. In NoSQL, you use map/reduce to create a 'view' (similar to a resultset); this view is a subset of the overall data.
Map essentially extracts data, and reduce aggregates it. The more familiar you are with RDBMS, the more difficult grasping map/reduce will be. Map/reduce has a benefit over SQL queries: the map/reduce task can be distributed among multiple nodes, which is hard to achieve with a traditional single-server RDBMS query. Adding a new record to the database does not always require the map/reduce task to be completely rerun.
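The idea of splitting the work across nodes can be sketched in plain JavaScript. This is a toy model running in memory, not CouchDB's actual engine; the documents, the two pretend "nodes", and the function names are all assumptions for illustration.

```javascript
// Toy map/reduce over in-memory documents (not CouchDB's actual engine).
const docs = [
  { _id: "a", type: "invoice", total: 10 },
  { _id: "b", type: "invoice", total: 20 },
  { _id: "c", type: "customer", name: "Jane" },
  { _id: "d", type: "invoice", total: 5 },
];

// Map: extract one key/value row per interesting document.
function mapInvoice(doc) {
  return doc.type === "invoice" ? [{ key: doc._id, value: doc.total }] : [];
}

// Reduce: aggregate a list of values into a single value.
function sumValues(values) {
  return values.reduce((acc, v) => acc + v, 0);
}

// Pretend the documents live on two nodes; each node maps and reduces its own share.
const nodeA = docs.slice(0, 2);
const nodeB = docs.slice(2);
const partials = [nodeA, nodeB].map((node) =>
  sumValues(node.flatMap(mapInvoice).map((row) => row.value))
);

// Combining the partial results gives the same answer as a single pass.
const distributedTotal = sumValues(partials);
const singlePassTotal = sumValues(docs.flatMap(mapInvoice).map((row) => row.value));
console.log(distributedTotal, singlePassTotal);
```

Because each partial result can be kept and combined later, a new document only needs to be mapped once and folded into the existing totals, which is the intuition behind not rerunning the whole task.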
Introducing CouchDB
A few facts about CouchDB that you should know:
- CouchDB is a JSON document-oriented database written in Erlang.
- It is a highly concurrent database designed to replicate easily across numerous devices (horizontal scaling) and to be fault tolerant.
- It is part of the NoSQL generation of databases.
- It is an open source Apache Software Foundation project.
- It allows applications to store JSON documents via its RESTful interface.
- It makes use of map/reduce to index and query the database.
Major Benefits of CouchDB
- JSON Documents - Everything stored in CouchDB boils down to a JSON document.
- RESTful Interface - From creation to replication to data insertion, every management and data task in CouchDB can be done via HTTP.
- N-Master Replication - You can make use of any number of 'masters', making for some very interesting replication topologies.
- Built for Offline - CouchDB can replicate to devices (like Android phones) that can go offline and handle data sync for you when the device is back online.
- Replication Filters - You can filter precisely the data you wish to replicate to different nodes.
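As a rough illustration of replication filters, the sketch below shows a filter function in the shape CouchDB expects: it receives a document and a request object and returns true for documents that should be replicated. The owner field, the owner query parameter, and the design document name _design/app are assumptions invented for this example.

```javascript
// Sketch of a CouchDB filter function: return true to replicate the document.
// The "owner" field and the "owner" query parameter are assumptions for this example.
function byOwner(doc, req) {
  return doc.owner === req.query.owner;
}

// Filter functions are stored as strings inside a design document.
const designDoc = {
  _id: "_design/app", // hypothetical design document name
  filters: { byOwner: byOwner.toString() },
};

// Local dry run with sample documents and a fake request object.
const sample = [
  { _id: "inv-1", owner: "gavin" },
  { _id: "inv-2", owner: "jane" },
];
const replicated = sample.filter((doc) => byOwner(doc, { query: { owner: "gavin" } }));
console.log(replicated.map((doc) => doc._id));
```

Running the filter locally like this is a cheap way to check which documents a given user would receive before wiring it into a real replication.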
Putting It All Together
CouchDB allows you to write a client-side application that talks directly to the Couch without the need for a server-side middle layer, significantly reducing development time. With CouchDB, you can handle growing demand by adding more replication nodes. CouchDB allows you to replicate the database to your client, and with filters you could even replicate only that specific user's data.
Having the database stored locally means your client side application can run with almost no latency. CouchDB will handle the replication to the cloud for you. Your users could access their invoices on their mobile phone and make changes with no noticeable latency, all whilst being offline. When a connection is present and usable, CouchDB will automatically replicate those changes to your cloud CouchDB.
CouchDB is a database designed to run on the internet of today for today's desktop-like applications and the connected devices through which we access the internet.
Step 1 - Installing CouchDB
The easiest way to get CouchDB up and running on your system is to head to CouchOne and download a CouchDB distribution for your OS -- OSX in my case. Download the zip, extract it and drop CouchDBX into your Applications folder (instructions for other operating systems are on CouchOne).
Finally, open CouchDBX.
Step 2 - Welcome to Futon
After CouchDB has started, you should see the Futon control panel in the CouchDBX application. In case you can't, you can access Futon via your browser. Looking at the log, CouchDBX tells us CouchDB was started at http://127.0.0.1:5984/ (may be different on your system). Open a browser and go to http://127.0.0.1:5984/_utils/ and you should see Futon.
Throughout the rest of this tutorial I will be using Futon in Firefox. I'll also have Firebug and the console view open to see all the HTTP requests Futon is sending behind the scenes. This is useful as your application can do everything Futon is doing. Let's go ahead and create a database called mycouchshop.
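Behind the scenes, creating a database is a single HTTP PUT to the database name. The sketch below builds that request and wraps it in a function you could call against a running instance. The helper names are made up for this example, and the code assumes a runtime with a global fetch (such as a current browser or Node 18+) and the default address shown above.

```javascript
// Describe the HTTP request that creates a database: PUT /<name>.
function createDatabaseRequest(baseUrl, name) {
  return { method: "PUT", url: `${baseUrl}/${encodeURIComponent(name)}` };
}

// Sends the request; needs a running CouchDB and a runtime with fetch.
async function createDatabase(baseUrl, name) {
  const { method, url } = createDatabaseRequest(baseUrl, name);
  const response = await fetch(url, { method });
  return response.json(); // CouchDB answers with a small JSON body
}

// Only build and print the request here; nothing is sent.
const request = createDatabaseRequest("http://127.0.0.1:5984", "mycouchshop");
console.log(request.method, request.url);
```

Calling createDatabase("http://127.0.0.1:5984", "mycouchshop") would send the request for real; your host and port may differ.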
CouchDB jQuery Plugin
Futon is actually using a jQuery plugin to interact with CouchDB. You can view that plugin at http://127.0.0.1:5984/_utils/script/jquery.couch.js (bear in mind your port may be different). This gives you a great example of interacting with CouchDB.
Step 3 - Users in CouchDB
CouchDB, by default, is completely open, giving every user admin rights to the instance and all its databases. This is great for development but obviously bad for production. Let's go ahead and set up an admin. In the bottom right, you will see "Welcome to Admin Party! Everyone is admin! Fix this".
Go ahead and click "Fix this" and give yourself a username and password. This creates an admin account. Anonymous users keep read and write access to all the databases, but lose configuration privileges.
More on Users
Users in CouchDB can be a little confusing to grasp initially, especially if you're used to creating a single user for your entire application and then managing users yourself within a users table (not the MySQL users table). In CouchDB, it would be unwise to create a single super user and have that user do all the reading and writing, because if your app is client-side, this super user's credentials will be in plain sight in your JavaScript source code.
CouchDB has user creation and authentication baked in. You can create users with the jQuery plugin using $.couch.signup(). These essentially become the users of your system. Users are just JSON documents like everything else, so you can store any additional attributes you wish, such as an email address. You can then use roles (CouchDB's equivalent of groups) to control which documents each user has write access to. For example, you can create a database that the user can write to and then add them to a role with read access to the other databases as required.
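To show what "users are just JSON documents" looks like, here is a hedged sketch of a user document. The layout (an _id prefixed with org.couchdb.user:, plus name, type, roles and password fields) follows the convention used by recent CouchDB releases; older releases stored a salted password hash instead, so treat the exact fields as an assumption and check them against your version. The buildUserDoc helper and the sample credentials are invented for this example.

```javascript
// Build a user document for CouchDB's _users database.
// Field names follow the convention of recent CouchDB releases (an assumption here;
// older releases stored a salted hash instead of a plain "password" field).
function buildUserDoc(name, password, extra = {}) {
  return {
    ...extra, // any additional attributes, e.g. email
    _id: `org.couchdb.user:${name}`,
    name,
    type: "user",
    roles: [],
    password,
  };
}

const userDoc = buildUserDoc("gavin", "s3cret", { email: "gavin@example.com" });
console.log(JSON.stringify(userDoc, null, 2));
```

The extra attributes are spread first so they can never overwrite the reserved fields such as _id or type.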
Step 4 - Creating a Product Document
Now let's create our first document using Futon through the following steps:
- Open the mycouchshop database.
- Click "New Document".
- Click "Add Field" to begin adding data to the JSON document. Notice how an ID is pre-filled for you; I would strongly advise not changing it. Add the key "name" with the value "Nettuts CouchDB Tutorial One".
- Make sure you click the tick next to each attribute to save it.
- Click "Save Document".
Go up a level, back to the database, and you should see one document listed with the previous ID as the key and a value beginning with {rev:. This is the JSON document you just created.
Step 5 - Updating a Document
CouchDB is an append-only database -- new updates are appended to the database and do not overwrite the old version. Each new update to a JSON document with a pre-existing ID will add a new revision. This is what the automatically inserted revision key signifies. Follow the steps below to see this in action:
- Viewing the contents of the mycouchshop database, click the only record visible.
- Add another attribute with the key "type" and the value "product".
- Hit "Save Document".
After hitting save, a new revision key should be visible starting with the number 2. Going back a level to the mycouchshop database view, you will still see just one document; this is the latest revision of our product document.
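The role of the revision key during an update can be sketched with a toy in-memory store. This is an illustration only, not CouchDB's code: the store, the save function, the product-1 ID and the "-toy" revision suffix are all made up. What it mirrors is the behaviour that an update must quote the current revision, and a stale one is rejected as a conflict.

```javascript
// Toy in-memory model of revision checking (an illustration, not CouchDB's code).
const store = new Map();

function save(doc) {
  const current = store.get(doc._id);
  // An update must quote the latest _rev, otherwise it is rejected as a conflict.
  if (current && current._rev !== doc._rev) {
    return { error: "conflict" };
  }
  const generation = current ? parseInt(current._rev, 10) + 1 : 1;
  // Real revisions look like "2-<hash>"; a fixed suffix stands in for the hash here.
  const saved = { ...doc, _rev: `${generation}-toy` };
  store.set(doc._id, saved);
  return { ok: true, id: saved._id, rev: saved._rev };
}

const first = save({ _id: "product-1", name: "Nettuts CouchDB Tutorial One" });
const second = save({
  _id: "product-1",
  _rev: first.rev,
  name: "Nettuts CouchDB Tutorial One",
  type: "product",
});
// Reusing the old revision simulates a client working from out-of-date data.
const stale = save({ _id: "product-1", _rev: first.rev, type: "oops" });
console.log(first.rev, second.rev, stale.error);
```

The store still holds a single entry for the document after both saves, which matches what you see in Futon: one document, at its latest revision.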
Revisions
While CouchDB uses revisions internally, try not to lean on them too much. Old revisions can be cleaned out through Futon quite easily, and they are not designed to be used as a revision control system. CouchDB uses the revisions as part of its replication functionality.
Step 6 - Creating a Document Using cURL
I've already mentioned that CouchDB uses a RESTful interface, and eagle-eyed readers will have noticed Futon using it via the console in Firebug. In case you didn't, let's prove this by inserting a document using cURL via the Terminal.
First, let's create a JSON document with the contents below and save it to the desktop as person.json.
{ "forename": "Gavin", "surname": "Cooper", "type": "person" }
Next, open the terminal and execute cd ~/Desktop/ to put you in the correct directory, and then perform the insert with curl -X POST http://127.0.0.1:5984/mycouchshop/ -d @person.json -H "Content-Type: application/json". CouchDB should return a JSON document similar to the one below.
{"ok":true,"id":"c6e2f3d7f8d0c91ce7938e9c0800131c","rev":"1-abadd48a09c270047658dbc38dc8a892"}
This is the ID and revision number of the inserted document. CouchDB follows the RESTful convention and thus:
- POST - creates a new record
- GET - reads records
- PUT - updates a record
- DELETE - deletes a record
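The same insert can be expressed from JavaScript. The sketch below builds the request that the cURL command above sends and wraps it in a function for use against a running instance. The helper names are invented for this example, and the code assumes a runtime with a global fetch (a current browser or Node 18+).

```javascript
// Build the same request the curl command sends: POST the JSON body to the database URL.
function insertRequest(dbUrl, doc) {
  return {
    url: dbUrl,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(doc),
    },
  };
}

// Sends it; requires a running CouchDB and a runtime with fetch.
async function insertDocument(dbUrl, doc) {
  const { url, options } = insertRequest(dbUrl, doc);
  const response = await fetch(url, options);
  return response.json(); // e.g. {"ok":true,"id":"...","rev":"1-..."}
}

// Only build and print the request here; nothing is sent.
const person = { forename: "Gavin", surname: "Cooper", type: "person" };
const insert = insertRequest("http://127.0.0.1:5984/mycouchshop/", person);
console.log(insert.options.method, insert.url);
```

Keeping the request builder separate from the network call makes it easy to inspect exactly what would be sent, much like watching Futon's requests in Firebug.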
Step 7 - Viewing All Documents
We can further verify our insert by viewing all the documents in our mycouchshop database by executing curl -X GET http://127.0.0.1:5984/mycouchshop/_all_docs.
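The response to _all_docs is a JSON object with a rows array, where each row holds a document's ID and current revision rather than its body. The sketch below parses a sample response of that shape. The first row reuses the ID and revision returned in Step 6; the second row's ID and revision are made-up placeholders, and listIdsAndRevs is a hypothetical helper.

```javascript
// Sample response in the shape CouchDB returns from _all_docs.
// The second ID and revision are made-up placeholders.
const allDocsResponse = {
  total_rows: 2,
  offset: 0,
  rows: [
    {
      id: "c6e2f3d7f8d0c91ce7938e9c0800131c",
      key: "c6e2f3d7f8d0c91ce7938e9c0800131c",
      value: { rev: "1-abadd48a09c270047658dbc38dc8a892" },
    },
    { id: "example-product-id", key: "example-product-id", value: { rev: "2-example" } },
  ],
};

// Each row carries only the ID and current revision, not the document body.
function listIdsAndRevs(response) {
  return response.rows.map((row) => ({ id: row.id, rev: row.value.rev }));
}

console.log(listIdsAndRevs(allDocsResponse));
```

If you need the document bodies as well, appending ?include_docs=true to the URL should add a doc field to each row.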
Step 8 - Creating a Simple Map Function
Viewing all documents is fairly useless in practical terms. It would be more useful to view only the product documents. Follow the steps below to achieve this:
- Within Futon, click on the view drop down and select "Temporary View".
- This is the map reduce editor within Futon. Copy the code below into the map function.
function (doc) { if (doc.type === "product" && doc.name) { emit(doc.name, doc); } }
- Click run and you should see the single product we added previously.
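- Go ahead and make this view permanent by saving it, using "products" as both the design document name and the view name so that it matches the URL used below.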
After creating this simple map function, we can now request this view and see its contents over HTTP using the following command: curl -X GET http://127.0.0.1:5984/mycouchshop/_design/products/_view/products.
A small thing to notice is how we get the document's ID and revision by default.
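To see why the ID and revision show up, it can help to run the map function locally. The sketch below is a simplified stand-in for what the view server does: inside CouchDB, emit() is available as a global, whereas here it is passed in as a parameter. The runMap helper and the sample documents (with placeholder IDs and revisions) are assumptions for this example.

```javascript
// Run a CouchDB-style map function locally by supplying our own emit().
function runMap(mapFn, docs) {
  const rows = [];
  // CouchDB provides emit() as a global inside the view server; here we pass it in.
  for (const doc of docs) {
    mapFn(doc, (key, value) => rows.push({ id: doc._id, key, value }));
  }
  // View rows are returned sorted by key.
  return rows.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
}

// Same logic as the tutorial's map function, with emit passed as a parameter.
function productsMap(doc, emit) {
  if (doc.type === "product" && doc.name) {
    emit(doc.name, doc);
  }
}

// Placeholder documents standing in for the contents of mycouchshop.
const sampleDocs = [
  { _id: "p1", _rev: "2-example", name: "Nettuts CouchDB Tutorial One", type: "product" },
  { _id: "u1", _rev: "1-example", forename: "Gavin", surname: "Cooper", type: "person" },
];

const viewRows = runMap(productsMap, sampleDocs);
console.log(viewRows.length, viewRows[0].key);
```

Each row records the ID of the document that produced it, and because the whole document is emitted as the value, its revision comes along too.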
Step 9 - Performing a Reduce
To perform a useful reduce, let's add another product to our database and add a price attribute with the value of 1.75 to our first product.
{ "name": "My Product", "price": 2.99, "type": "product" }
For our new view, we will include a reduce as well as a map. First, we need a map function, defined as below.
function (doc) { if (doc.type === "product" && doc.price) { emit(doc._id, doc.price); } }
The above map function simply checks that the document passed in is a product and that it has a price. If these conditions are met, the product's price is emitted with the document's ID as the key. The reduce function is below.
function (keys, prices) { return sum(prices); }
The above function takes the prices and returns their sum using CouchDB's built-in sum() helper. Make sure you check the reduce option in the top right of the results table, as you may otherwise be unable to see the results of the reduce. You may need to do a hard refresh on the page to see the reduce option.
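The map and reduce pair can also be tried outside CouchDB. The sketch below is a simplified local run: emit() is passed in as a parameter, sum() is a stand-in for the helper CouchDB makes available to reduce functions, and the keys argument is simplified to a plain list of keys. The document IDs are placeholders.

```javascript
// Map: emit the price of every product, keyed by document ID.
function priceMap(doc, emit) {
  if (doc.type === "product" && doc.price) {
    emit(doc._id, doc.price);
  }
}

// Stand-in for the sum() helper that CouchDB makes available to reduce functions.
function sum(values) {
  return values.reduce((acc, v) => acc + v, 0);
}

// Reduce: same body as the tutorial's reduce function.
function priceReduce(keys, prices) {
  return sum(prices);
}

// Placeholder documents: the two products from this step plus the person from Step 6.
const shopDocs = [
  { _id: "p1", name: "Nettuts CouchDB Tutorial One", price: 1.75, type: "product" },
  { _id: "p2", name: "My Product", price: 2.99, type: "product" },
  { _id: "u1", forename: "Gavin", surname: "Cooper", type: "person" },
];

const emitted = [];
for (const doc of shopDocs) {
  priceMap(doc, (key, value) => emitted.push({ key, value }));
}

const total = priceReduce(emitted.map((row) => row.key), emitted.map((row) => row.value));
console.log(total.toFixed(2)); // prints "4.74"
```

For a plain sum like this, CouchDB also offers a native reduce that you can select by entering _sum in place of the JavaScript function.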
Conclusion
In this tutorial, we took a brief but focused look at CouchDB. We saw the potential power of CouchDB and how easy it is to get started. I'm sure you have plenty of questions at this point so feel free to chime in below. Thank you so much for reading!