Rule to validate SQL syntaxis

Hi! We have received a request to create a new rule on CachéQuality to identify when a developer uses double quotes (" ") within any SQL statement.

We have been asked many times about SQL validation rules, and we would like to open a debate to allow everyone discuss what would you like to be checked on a SQL statement.

Current examples are for basic situations:

  • Using SQL.Statement class:

Set stmt = ##CLASS(%SQL.Statement).%New()
Set query = "Select Val1, Val2 FROM Table WHERE Val1=""SomeCondition"""

  • Using embedded SQL

&SQL(SELECT Val1, Val2
                INTO :val1, :val2
                FROM Table
                WHERE Val1="SomeCondition")

  • Ideally these should be :

Set stmt = ##CLASS(%SQL.Statement).%New()
Set query = "Select Val1, Val2 FROM Table WHERE Val1='SomeCondition'"
 
&SQL(SELECT Val1, Val2
                INTO :val1, :val2
                FROM Table
                WHERE Val1='SomeCondition')

 

All your feedback/explanations/requests are welcome!!

  • 0
  • 0
  • 179
  • 10
  • 3

Answers

Another check. Properties can't be SQL reserved words. I often name properties "Date", etc. only to forget that they can't be used in SQL as is only quoted, so I need to go to Class view and rename them to something else.

So here we need a "Reserved word used as SQL property" rule, and this is a issue for readability and a best practice. Isn't it?

Yes. To get a full list of reserved works execute:

zw ^%qCacheSQL("reservewords")

Check that after &SQL(SELECT ...) value of SQLCODE is checked.

+1 to Alexander. 

And I propose the new Warning rule for this:

Check for if SQLCODE after every &SQL() before the next &SQL() or before the end of the method.

It would be really useful if you can attach some examples about your "query" construction, so we can identify the different situations to be able to build proper rules.

Comments

If we talking about some InterSystems Community-wide coding guidelines my vote is for the second variant cause it's more readable.

First, I would not use the input values directly in the query text, but would use Input Host Variables
Second, if it is necessary, it is more elegant to use the following syntax:

Set stmt ##CLASS(%SQL.Statement).%New()
Set query $$$FormatText("Select Val1, Val2 FROM Table WHERE Val1=%1",$$$quote("SomeCondition"))
Ideally these should be :
&SQL(SELECT Val1, Val2
INTO :val1, :val2
FROM Table
WHERE Val1='SomeCondition')

This is and now works:

&SQL(SELECT Val1Val2
               INTO :val1:val2
               FROM "Table"
               WHERE "Val1"='SomeCondition')

The main check we need is parametrization. SQL should not be concattenated from user input, but user input should be passed as an argument.

I think you propose a "Possible SQL injection" rule, to detect when sql is created from strings concatenation without proper parametrization.

This is less about a community wide coding guideline and more about the validity and helpfulness of a code smell being added within CachéQuality.

In the OP the examples are basic but provide code that executes and just fine even though the code/syntax could be much better.

The code smell that Dainel is speaking of would identify areas where this, less than ideal, syntax is used so it can either be prevented or worked on as tech debt.

If the above code existed and there was a business need to enable support for delimited identifiers there would be problems.

This rule may break into multiple rules, one to identity the use of double quotes in embedded SQL, one for the use in SQL Statements.  Then also as Eduard suggested, a rule for parametrization as a whole.

Thoughts?