Header Logo

Lessons From My Mistakes: Start With MySQL

These are real issues that I walked myself right into by starting my coding career with NoSQL instead of MySQL

July 26, 2017-5 min read

If you’re about to begin writing your first application you might be tempted to start with NoSQL. These are real issues that I walked myself right into by doing just that.

Plenty has been written on the differences between these two. Definitely read more than just this. But if you are just starting out there are two fundamental things about working with NoSQL that could cost you significant time.

This is not to say that NoSQL shouldn’t ever be used. Not at all, I use it everyday for very important pieces of my applications. But I shouldn’t have started with it.

1First and foremost, NoSQL does not have a visual database editor like MySQL Workbench. In it you can test every query before you write the query into your code to see if it returns, alters, or deletes what you want.

This is exactly what your code string will look like when you need to make an actual query in your code. Once you type it into the the editor and hit run it sends back the results.

This goes for any type of query. Not just retrievals. You can alter tables, add columns, remove rows, everything¹, right from the Workbench interface. That type of power will dramatically decrease the amount of time you spend debugging and arranging your database. 

On the other hand NoSQL requires you to test queries through an environment that you are responsible for setting up from scratch. Your IDE.

With NoSQL every time you want to make sure your database query is doing what you need you’ll have to hand write some code, save the file, run it , and then check the unformatted output.

That might not seem like it would take a lot of time but if you do it a couple hundred times during one project you’ll add at least a few hours to your development time. This leads to what I call trial-and-error fatigue. Hesitation to try something out or make something better because of the turn-around time in working out the kinks.

2Second, lets assume that you have more than 1,000 items in a database table. If you don’t know exactly which item you need, running an operation that searches through your whole NoSQL table to find it will be expensive in terms of memory and time.

That type of operation is called a scan. If you’re reading this and thinking that you will be doing a lot of that, NoSQL will be a world of hurt. Same thing goes if you need to retrieve more than 1,000 items at a time.

Even if you don’t foresee many scans, it’s going to take some time to get your database feet under you. While you’re learning you’ll make some mistakes that you’ll feel compelled to solve by making scans like this…

Find all restaurants whose type = sushi or pizza

This is a fine query to run on MySQL. Not so much on NoSQL. It can be done up to a certain point but it’s not what it’s for.

Instead, NoSQL is built for situations where you know exactly which item you want.

Get restaurant 004488

Take a messaging application for example. A messaging application is a good use for a NoSQL database. However, if it’s your first time building one you might run into some unexpected problems. 

In a messaging application you will always know the IDs of the participants (sarah101, jill102), and the message thread (229900). So you can always query for specific items by their ID.

You could get the exact thread for jill102 and sarah101:

Get thread 229900

Or if we don’t know what the thread is for jill102 and sarah101’s conversation you could get the items for jill102 and sarah101 and then compare the threads that they are are a part of:

Get user jill102
returns... {user: jill102, displayName: JillPickle, email: jill102@mail.com threads: [229900, 112284, 030399]}

Get user sarah101
returns... {user: sarah101, displayName: SarahBear, email: sarah101@mail.com threads: [229900, 887765, 2004123]}

Get thread 229900

But in a NoSQL database you should not get all the threads that jill102 and sarah101 are apart of by scanning your entire thread table like this:

Find all threads whose users are jill102 and sarah101

For instance, a few months into your messaging application you decide to let users search through the text in their threads. Considering that sarah101 and jill102 message each other at least 50 times a day, this is a huge problem for NoSQL.

If Jill wants to find a message from Sarah about the rock climbing trip a couple months ago you would have to search through all the messages in thread 002299:

Find all messages in thread 002299 that contain "rock climbing"

That query is going to look at everyone of the thousands of message in their thread.

If it’s a predictive search it gets even worse since you’ll be firing off that query with each key stroke:

Find all messages that contain "r"
Find all messages that contain "ro"
Find all messages that contain "roc"
Find all messages that contain "rock"
Find all messages that contain "rocki"

Either the query will return poor results. Or, your database will have burned through enough resources to cause problems for all users².

This isn’t an uncommon type of situation when you’re  starting out. It’s part of the learning curve. You won’t be able to anticipate the need to store something that’s necessary for a feature that you haven’t thought of yet. Then down the road when you want to add a new feature you’ll end up running a scan where you shouldn’t (or can’t) because it’s a catch-all. An expensive, and unreliable, catch-all.

The Workbench³ interface cannot be overstated in terms of value. It will save you time and help you have a better understanding of what’s going on in your database. Add to that the flexibility of MySQL and you’ll be setup to focus on the important aspects of your project as opposed to the headache inducing details of how to work around the constraints of your database.

¹Everything. You can check your schema performance, store procedures, create backups. You can do everything a MySQL database can do right from Workbench.

²Possibilities in this scenario include, but are certainly not limited to:

  1. Unsaved/unsent/unreceived messages
  2. A never ending spinner

³There are others as well. Sequel Pro is a popular choice.

MySQL Resources

Amazon Web Services RDS (Relational Database System)
Using MySQL with Node.js

NoSQL Resources

Amazon Web Services DynamoDBDynamoDB BasicsNPM dynamodb wrapper

© 2019 Phil Andrews