doors, choices, choose, open, decision, opportunity, choosing, career, option, entrance, challenge, way, decide, strategy, doorway, select, alternative, future, freedom, life, entering, finding, chance, exit, solution, confusion, labyrinth, complexity, search, goal, confused, puzzle, risk, courage, maze, problem, hard, game, wise, complex, white, creativity, doors, doors, doors, choices, choose, decision, decision, opportunity, opportunity, opportunity, career, career, career, career, career, challenge, strategy, strategy, future, future, future, future, future, solution, solution, confusion, confusion, labyrinth, labyrinth, search, search, goal, goal, confused, confused, confused, puzzle, puzzle, risk, risk, maze, maze, maze, problem

Clearly defining columns in SQL queries

Everyone has a pattern or writing style when it come to SQL I’ve noticed. Some people use uppercase for keywords other people don’t, some start a column on a new line others don’t, people format differently, leaving commas at the end of the line vs at the beginning and many more niche ways people write. Which is correct ? It really depends on what is easiest to read and maintain for the writer but we can all agree that writing a long line of SQL code with no space or line breaks is just not very helpful to anyone at all.

Whatever your writing style is, just ensure it’s “clean” somewhat as I couldn’t tell you if there’s a right or wrong way overall. using ctrl+F7 on SQL developer will format a query into a readable format. (Maybe we should all write like what SQL developer formats to). However I once had a really stressful query to maintain and the experience of that changed one thing about the way I write and I thought I’d share that here.

WHAT HAPPENED?

I was once asked to make some changes to a view and the experience was rather unsavory. The view was written by a different developer who had before I joined the team, since moved on from the job to working barstools somewhere in the caribbean, and I had the unfortunate displeasure of making changes to this rather troublesome SQL. Because I cannot actually share the exact query on here I will give you an idea of how it was written, pretty mundane but still can cause some grief.

THE QUERY

consider the sample HR schema and a query to extract some employee details written as such;

SQL> SELECT
  2         employee_id
  3        ,first_name
  4        ,last_name
  5        ,job_title
  6        ,department_name
  7        ,salary
  8        ,hire_date
  9        ,phone_number
 10        ,city
 11        ,state_province
 12        ,country_name
 13        ,region_name
 14  FROM
 15        employees
 16  JOIN  jobs USING (job_id)
 17  JOIN  departments USING(department_id)
 18  JOIN  locations USING(location_id)
 19  JOIN  countries USING(country_id)
 20  JOIN  regions   USING(region_id)
 21* WHERE department_name = 'IT';

   EMPLOYEE_ID    FIRST_NAME    LAST_NAME        JOB_TITLE    DEPARTMENT_NAME    SALARY    HIRE_DATE    PHONE_NUMBER         CITY    STATE_PROVINCE                COUNTRY_NAME    REGION_NAME
______________ _____________ ____________ ________________ __________________ _________ ____________ _______________ ____________ _________________ ___________________________ ______________
           103 Alexander     Hunold       IT Specialist    IT                      9000 03-JAN-90    590.423.4567    Southlake    Texas             United States of America    Americas
           104 Bruce         Ernst        IT Specialist    IT                      6000 21-MAY-91    590.423.4568    Southlake    Texas             United States of America    Americas
           105 David         Austin       IT Specialist    IT                      4800 25-JUN-97    590.423.4569    Southlake    Texas             United States of America    Americas
           106 Valli         Pataballa    IT Specialist    IT                      4800 05-FEB-98    590.423.4560    Southlake    Texas             United States of America    Americas
           107 Diana         Lorentz      IT Specialist    IT                      4200 07-FEB-99    590.423.5567    Southlake    Texas             United States of America    Americas

Now the above query works fine but the problem is can you tell from looking at the above query what table the state_province column is coming from?

The Issue

Whilst this schema (if you’ve ever used it before) is not that hard to figure out and probably you can guess with some ease where the columns are coming from and really from any other schema/query it’s possible with some digging into the data dictionary to figure out where the columns may be coming from, the whole point is we can write our query such that one day when we are trying out different islands and enjoying sunsets, the poor people who have to maintain our code will have an easier time at it. Not only that the columns are undefined, the join types are not explicit and while we can tell that those all default to inner joins, there’s a possibility that logically that should not have been the right join choice but rather it’s just a matter of chance that the results end up being correct. Be kind, write explicit and self defining SQL code.

A Better Way

Once more I am not saying everyone should write SQL like me or in a particular manner (everyone has their quirks) but put some effort to make it readable and understandable for the next guy. unless you plan to live forever and stay at your job forever, then it doesn’t really matter how you write… actually it still does cause sometimes I read the stuff I wrote a couple of months ago and I’m going “what the heck is going on here?”

A better version of the query above could be as such

SQL> SELECT
  2         e.employee_id
  3        ,e.first_name
  4        ,e.last_name
  5        ,j.job_title
  6        ,d.department_name
  7        ,e.salary
  8        ,e.hire_date
  9        ,e.phone_number
 10        ,l.city
 11        ,l.state_province
 12        ,c.country_name
 13        ,r.region_name
 14  FROM
 15              employees   e
 16  INNER JOIN  jobs        j   ON e.job_id        = j.job_id
 17  INNER JOIN  departments d   ON e.department_id = d.department_id
 18  INNER JOIN  locations   l   ON d.location_id   = l.location_id
 19  INNER JOIN  countries   c   ON l.country_id    = c.country_id
 20  INNER JOIN  regions     r   ON c.region_id     = r.region_id
 21* WHERE department_name = 'IT' ;

   EMPLOYEE_ID    FIRST_NAME    LAST_NAME        JOB_TITLE    DEPARTMENT_NAME    SALARY    HIRE_DATE    PHONE_NUMBER         CITY    STATE_PROVINCE                COUNTRY_NAME    REGION_NAME
______________ _____________ ____________ ________________ __________________ _________ ____________ _______________ ____________ _________________ ___________________________ ______________
           103 Alexander     Hunold       IT Specialist    IT                      9000 03-JAN-90    590.423.4567    Southlake    Texas             United States of America    Americas
           104 Bruce         Ernst        IT Specialist    IT                      6000 21-MAY-91    590.423.4568    Southlake    Texas             United States of America    Americas
           105 David         Austin       IT Specialist    IT                      4800 25-JUN-97    590.423.4569    Southlake    Texas             United States of America    Americas
           106 Valli         Pataballa    IT Specialist    IT                      4800 05-FEB-98    590.423.4560    Southlake    Texas             United States of America    Americas
           107 Diana         Lorentz      IT Specialist    IT                      4200 07-FEB-99    590.423.5567    Southlake    Texas             United States of America    Americas

Now I don’t know about you, But I would much rather fix this query than the earlier one. I also always try to keep my queries this way and anytime I notice it, I try to plead on people to at least put an alias before the column name.

CONCLUSION

This short post talked about the difficulty that could arise from writing queries that are not explicitly defined or self explanatory so to speak. As a matter of fact the query I had to actually fix had 50+ columns coming from 5 different tables. It was the day of my dreams… but nightmares are dreams too! Some comments in the code would have helped at least but I had nothing to work with. It was a very long and unjoyful day at the office. Hopefully this post brought some awareness that you yourself or someone else will have to maintain what you write today. write it such that when that day comes you will not have to scratch your head. The world is hard enough as it is we need less problems and SQL should not be one of them, rather it should be only a solution to some of our problems.

Cheers, Harris.

Leave a Comment

Your email address will not be published. Required fields are marked *