Oct 30, 2006

A Scenario to Ponder #7

A totally hypothetical and insane requirement, incontestably prone to excessive criticism (to put it mildly) from Joe Celko, if he ever happens to visit my blog :)

But, lets just assume that the requirement is warranted and we need a solution in SQL Server 2000.
I have a table with this definition.

CREATE TABLE BAD_ENCRYPTION(VAL_ID INT, VALUE VARCHAR(100))


And the following sample data.

INSERT INTO BAD_ENCRYPTION VALUES(1,'OAMBNRITBAUQZYZZ')
INSERT INTO BAD_ENCRYPTION VALUES(2,'SAQWLR')
INSERT INTO BAD_ENCRYPTION VALUES(3,'GZAXRCBVABGNEM')
INSERT INTO BAD_ENCRYPTION VALUES(4,'CAOWLELYEHCGTHOLR@')


If, you didn't guess the encryption strategy, the four rows will have the value OMNIBUZZ, SQL, GARBAGE, COLLECTOR if you knock off the letters in the even numbered positions.
Now, I need a query that will accept the unencrypted word and find the corresponding ID. Say, I search for GARBAGE, I need to get the value 3.


Assumptions:
  • Lets say there are about 50,000 rows.
  • And my search will result in only one row.
  • I can change my table design, if necessary
  • I cannot have a computed column having the actual value (then the whole point of this rudimentary encryption is lost).
How will I go about getting the data. What is my best approach?

12 comments:

  1. pretty easy I would say

    select *
    from BAD_ENCRYPTION
    where VALUE like '%g_a_r_b_a_g_e%'

    ReplyDelete
  2. Fair Enough Denis. Though I was looking at how you would form 'g_a_r_b_a_g_e_' from the input 'garbage' without a while loop (A set based solution).

    As an extension, can you think of a way to can get me the decrypted values for all the rows by knocking of the alternating characters?

    ReplyDelete
  3. Still pretty easy. ;)

    1:
    declare @mystring varchar(20)
    set @mystring = 'GARBAGE'

    select * from bad_encryption
    where value like
    (select substring(@mystring, n.num, 1) + '_'
    from dbo.nums n
    where n.num <= len(@mystring)
    for xml path(''))


    2:
    select *,
    (
    select substring(e.value,n.num,1)
    from dbo.nums n
    where n.num <= len(e.value)
    and n.num % 2 = 1
    for xml path('')
    ) as decrypted
    from bad_encryption e

    I'm sure there's a better way of using "for xml path('')" which will allow grouping... this does work though...

    ReplyDelete
  4. Rob,
    This will work for SQL Server 2005. We can even use a CLR function to get it.

    But I need the solution for SQL Server 2000.

    I guess we cannot do the decryption in SQL Server 2000 without a UDF having a while loop.

    But, can you find the equivalent for your first query in SQL Server 2000?

    ReplyDelete
  5. Oh whoops... missed that bit.

    ReplyDelete
  6. Here's a decryption mechanism for you:

    --alter table bad_encryption add decrypted varchar(50)

    update bad_encryption
    set decrypted = left(value,1)

    declare @length int
    select @length = max(len(value)) from bad_encryption
    declare @working int
    set @working = 3
    while @working < @length
    begin
    update bad_encryption
    set decrypted = decrypted + substring(value,@working,1)

    set @working = @working + 2
    end

    select *
    from bad_encryption


    As for finding 'G_A_R_B_A_G_E', how about:

    declare @mystring varchar(100)
    set @mystring = 'GARBAGE'
    declare @len int
    set @len = len(@mystring)
    declare @n int
    set @n = 1
    while @n < @len
    begin
    set @mystring = stuff(@mystring,@n*2,0,'_')
    set @n = @n + 1
    end
    set @mystring = @mystring + '_'
    select @mystring

    ReplyDelete
  7. Rob, you are still using a while loop for getting 'G_A_R_B_A_G_E'.
    Though I am not abl to find a better solution to decrypt the data (I am still trying hard to figure out one), how about this for searching?

    declare @a varchar(100)
    set @a = 'garbage'

    select val_id,value
    from bad_encryption a, numbers b
    where
    b.num <= len(@a) and
    ascii(upper(right(a.value,b.num*2))) = ascii(upper(right(@a,num)))
    group by val_id, value
    having count(*) = len(@a)

    I am using a numbers table here.
    Let me know your thoughts.

    ReplyDelete
  8. I'm not sure exactly what you're trying to do, but if you flip the problem around a bit, you can at least get good performance.

    Create a deterministic Encryption Function such as:

    CREATE FUNCTION ENCRYPT (@word AS varchar(50))
    RETURNS varchar(100)
    AS
    BEGIN
    DECLARE @pos INT
    SET @pos = LEN(@word)
    SET @word = @word + ' '
    WHILE @pos>0
    BEGIN
    SET @word = STUFF(@word, @pos+1,0, CHAR((((ASCII(UPPER(SUBSTRING(@word,@pos,1)))-ASCII('A')) + 10) % 26)+ASCII('A')))
    SET @pos= @pos - 1
    END
    RETURN RTRIM(@word)
    END

    Now you can populate the table and when you get the value, the value being searched for will be constant and an index seek will find the record. The only looping cost is to construct the encrypted value being searched for, but it's only created once.

    CREATE TABLE #BAD_ENCRYPTION(VAL_ID INT, VALUE VARCHAR(100))

    INSERT INTO #BAD_ENCRYPTION VALUES(1,dbo.ENCRYPT('OMNIBUS'))
    INSERT INTO #BAD_ENCRYPTION VALUES(2,dbo.ENCRYPT('SQL'))
    INSERT INTO #BAD_ENCRYPTION VALUES(3,dbo.ENCRYPT('GARBAGE'))
    INSERT INTO #BAD_ENCRYPTION VALUES(4,dbo.ENCRYPT('COLLECTOR'))
    CREATE UNIQUE INDEX #BAD_ENCRYPTION_VALUE ON #BAD_ENCRYPTION (VALUE)

    SELECT * FROM #BAD_ENCRYPTION WHERE VALUE=dbo.ENCRYPT('SQL')

    DROP TABLE #BAD_ENCRYPTION

    Larry

    ReplyDelete
  9. Hi Larry,
    Nice suggestion. If I start with putting the data that way, then I would accept that what you suggested would be the best approach. I am looking at selecting data encrypted with a partially deterministic function. I have no control over what the alternate characters will be. By the way, its not a real time scenario, just a scenario about decrypting the data with limited knowledge on the encryption. Thanks for spending your time in reading through this and coming up with an answer. I am writing these scenarios to find out different approaches to a problem.

    And regarding my approach, I am trying to compare every chraracter from the search string to every second character from each column. This I am doing it by using a numbers table.Try creating a table called NUMBERS with a column called NUM. Insert into it numbers from 1 to 50.

    Now execute the query, removing the group by clause and its fairly straight forward on how it works.

    ReplyDelete
  10. Here's a performant, set-based solution:

    http://blogs.technet.com/wardpond/archive/2006/11/12/database-programming-omnibuzz-steals-a-saturday-night.aspx

    ReplyDelete
  11. Here's a SQL Server 2000-based decrypter without a cursor..

    http://blogs.technet.com/wardpond/archive/2006/11/21/database-programming-a-decrypter-for-omni.aspx

    ReplyDelete
  12. Here is a solution for decrypting the value in SQL Server 2000. This will not work if the encrypted string is too large. But, its a solution anyways :)


    create function fn1(@Input varchar(50),@len int)
    returns varchar(25)
    as
    begin
    return (left(@input,1) + case when @len < 2 then '' else dbo.fn1(right(@input,@len-2),@len-2) end)
    end

    --select query here
    select *,dbo.fn1(value,len(value)) as DecryptedString from dbo.Bad_Encryption

    ReplyDelete