Happy Birthday! What’d I get you for your birthday? Ummm I got…. a fun experiment. Yeah yeah, let’s do a thought experiment, shall we? Close your eyes… no really go ahead and close them, I’ll watch your cake so no one takes it… I promise.
Imagine creating a brand new table with 5 columns — something like below.
CREATE TABLE YourCakeIsSafe
(Year INT,
CakeFlavor CHAR(10),
SecondsNotLooking INT,
HowGoodDoesItTaste (VARCHAR(MAX)),
ToEatOrNotToEat (BIT))
Now imagine inserting a row into this table for each year you’ve had a birthday (don’t worry I’m not going to ask how many rows would be in the table!). Finally, take a few seconds to think about all the rows of data and the fields in each row — imagine what it must look like within SQL. Now open your eyes and tell me what that table looked like. I bet I can tell you what you were imagining and, what’s more, would you believe it if I told you that you are likely wrong?
When most people envision their tables they imagine a structure with rows and columns. After all, that’s how its represented when we see it in the results pane if we were to run a SELECT statement. I bet in our example you were imagining all of the cakes you have had in the exact order you ate them from year one to year <mumble mumble> with one row for each year’s cake. I further wager you are envisioning the data stored with each field in the same order as you wrote the CREATE TABLE statement. How’d I do at my guess?
Table Order
A heap table (a table without a clustered index), such as in our example above, has no guaranteed order. Also, query results (SELECT * FROM YourCakeIsSafe) has no guaranteed order. This is one of the first mistakes many database developers make when they are starting out. They will be confused when they run a query and the results are in a specific order and then when they do a join it may completely flip the order — due to a merge join most likely.
The only way you can guarantee the order in the actual table itself is to place a clustered index on the table. Why is that beneficial? When looking up just a single cake from a specific year, it would be much easier for Yoda to know exactly where in the table to go to, rather than having to scan through all 900 rows.
When it comes to a query result, the only way to guarantee a specific order is with an ORDER BY clause. SQL is a set-based language – think of every table, view, or any record set as just a jumbled-up grouping of data that could be in any order. This allows SQL to figure out the best way to get what you want.
Field Order
SQL is storing the fields in a completely different order than you created them in. To make the most efficient use of space, it stores all the fixed width fields in the front of the record and then the variable width fields at the end. This means in our example above the HowGoodDoesItTaste field is actually at the end of the record. This is because it doesn’t know how long the data may be in that field.
Think of it this way, if SQL stored it in the order we listed and we INSERTed a row with a value for that field as “Good”, but then as we thought more about it we decided to UPDATE that to “So amazing, it’s like a piece of heaven!”. What would happen to ToEatOrNotToEat? It would need to get pushed further out to make room for all the extra characters you’ve added to the other field. Also, don’t forget, each record can only be 8k (actually less because of overhead bit masking) at most, so there may also be pointers to other pages if your record becomes too big.
Leave a comment below if you correctly envisioned how it is stored/returned. If not, how did you envision it?
