Phelio Gnomi

My words from my mind

Category Archives: Python

Reduce SQL Injection Risk in Python and psycopg2

It will be surprising that a slight different in your line of code can have a great impact in preventing SQL injection. Try to spot the difference between these 2 lines below.

# code 1 (with python string format() function)
db.execute("select * from some_table where filter = '{}'".format("going to Sam's party"))
# code 2 (with psycopg2 sql prepare syntax)
db.execute("select * from some_table where filter = %s", ("going to Sam's party"))

It probably looks obviously different, but for a second it looks like it shouldn’t give much different result. But sadly it does.

The first code use the python standard string formatting feature where given a string containing curly brackets as a placeholders like “This is a {}”, and with the format() method, it will fill those placeholders with other strings.

# Example
sample1 = "this is a {}, and also a {}".format("pen", "weapon")
print(sample1)
 
#results
# this is a pen, and also a weapon 

this looks fine for now. But try to do one for the string that we pass into the db.execute() above. If we print the string, it will give you the result below

select * from some_table where filter = 'going to Sam's party'

Notice the extra single quote on the filter? This will cause error and of course opening a whole world of opportunity for SQL injection attack. With the single quote unescaped, the rest of the string can be injected with other commands that will cause serious maintenance headache.

-- example: imagine the replacer string is "bleh'; drop table some_table; insert into user values ('some new malicious users'); --"
-- your query will become
select * from some_table where filter = 'bleh'; drop table some_table; insert into user values ('some new malicious users'); --'
-- note that double dash (--) is used for commenting. So the extra single quote will be ignored.

So, why does code 2 is a better way of string replacements? Because it has built in special character escaping mechanism in which all strings that are passed thorough this method will remain as a string instead of becoming a malicious codes.

db.execute("select * from some_table where filter = %s", ("bleh'; drop table some_table; insert into user values ('some new malicious users'); --"))
the code above will produce sql below

 

select * from some_table where filter = E'bleh\'; drop table some_table; insert into user values (\'some new malicious users\'); --'
Advertisements

From Unicorn to Unicode

What is worse than knowing that Unicorn exists in some other dimensions but you will never be able to see it?

My answer will be the xA0 character from some encoding world that I don’t even know to exist. Being an Earthling, the only encoding world I’ve been and known is the Unicode. More specifically the UTF-8 realm.

Interestingly, many Unicode based systems reject the xA0 (or any nonconvertible characters) and totally crashes the system. Take Python for example, and also PostgreSQL later on.

Python

In Python, there is a function call unicode() that convert a string from other encoding to Unicode.

unicode(object[encoding[errors]])

However, the “errors” handling is defaulted to “strict”. It means that it will complain that “Something is wrong” whenever there is an error. Basically it means that it will break the system when there is an untranslatable character in the object that you are trying to convert.

There are two other options in handling conversion errors.

  • “replace” to replace the untranslatable character to the official Unicode replacement character
  • “ignore” basically replace the untranslatable character with an empty string.

PostgreSQL

When inserting non Unicode strings into an UTF-8 (Unicode based) databases, PostgreSQL will try to translate them first. Same thing will happen if the said string contain an untranslatable character, it will throw you an error.

This can be a hell of a problem because it technically break your system if your system is a one of those systems that process input and save them into a database.

So the solution is usually to try to catch these unicorns before they escaped into the database.