Tuesday, March 31, 2009

Jack’s First SQL Query

Jack is just about 6 months old now. As anticipated, he has expressed an interest in programming. He was sitting on my lap yesterday as I was tweaking a query. He started to pound on the keyboard, so I opened up a fresh window for him so that he could express his programming desires uninhibited by my existing work.

Here is what he came up with

“ ."

That’s 2 spaces, a period, and another space. Not only should you appreciate the query itself, but you should appreciate the manner in which it was written. Some people have a vague idea of what they are doing and they sit down to figure it out through a cycle of research/trial-and-error. Others know what has to be done and simply do it; Solving the problem takes exactly as long as it takes to type in the solution. I am of the latter classification, and it would seem that whichever gene enables that skill has been passed to the offspring. Jack was Picasso and the keyboard was his tapestry; There was no delay or thought, simply action. It was so natural that a casual observer may have perceived it as nothing more than the random flailing of his 2 topmost limbs.

As a dad I proudly exclaim that this is a tremendous victory. While not completely void of issues, it is an excellent first step in solving many well known SQL puzzles. I love the initiative, the attitude, and the overall spirit of the effort. It takes more than skill to be a programmer; you need to love it. For that he gets an A+.

But, as a software architect and a mentor, I must be fair and point out the very rare opportunities for improvement.

Excessive white spaces – surely we don’t need spaces on both sides of the period. Perhaps he should consider a tab rather than consecutive leading spaces. I was going to suggest this to him, but he chose that moment to engage in massive crap. He was concentrating very hard on the pushing exercise, with his brow furrowed and his face turning red from the effort. Relative to his intestinal action and diaper trauma, the tab issue seemed trivial so was left unsaid.
The period – in this context, it doesn’t actually do anything, which is OK. I see that the period is a solution; its just that we don’t understand the problem because we're dumb. I only mention it here because he didn’t comment it.

Cleary Jack is on the road to programming greatness.

Sunday, March 29, 2009

XQuery – How ye disappoint

I’m updating a legacy app from VB6 to ASP and then to .NET. The ASP is a transitional step so that I can stop dealing with COM+ objects.

Anyhoo… I picked one of the common pages. It prints a grid, basically.

page calls a vbscript function
the vbscript function executes a query and gets back a recordset
a series of nested loops covert the recordset to xml
the xml is returned to the page
the page uses XSLT to convert the XML to HTML

Sweet.

Now that I’m converting it to .NET, though, I wanted to try exciting new possibilities. The application is on SQL 2000, but updating to 2005 is a reasonable expectation. (I won’t push my luck with 2008).

The Intent

The idea is to transform this data

into this xml

In the current app, that transformation is done in ASP VbScript.

SQL XML

I’ve dabbled with some of the XML capabilities in 2005. I’ve used it to join to tables and to shred the xml. I’ve also used it to create xml documents without that pesky !TAG! syntax. But, they were all meager efforts.

I started by hoping that such meagerness would be sufficient.

        1: declare @startDate datetime

       2: declare @endDate datetime

       3: declare @theaterId int

       4:  

       5: select

       6:     @startDate = '10/31/2003',

       7:     @endDate = '11/6/2003',

       8:     @theaterId = 170

       9:  

      10:     select 

      11:         v.FilmId "film/@film-id",

      12:         v.FilmName "film/@film-name",

      13:         v.PrintId "film/print/@print-id",

      14:         dates.[Date] "film/print/date/@date",

      15:         a.AuditoriumName "film/print/date/auditorium/@auditorium-name",

      16:         a.AuditoriumId "film/print/date/auditorium/@auditorium-id"

      17:     from

But its not. That built the the hierarchy, but it repeats itself over and over. It doesn’t group itself the way I need. If there was/is a way to do everything I need by specifying the paths like that, then it would be a good day.

SQL XQUERY

Next, I started dabbling with XQUERY. My only XQUERY experience has been via SQL Server 2005, and in its simplest form.

I went Google-Crazy and read up on some stuff. I was able to write a query (albeit a crappy one) that does the job.

       1: select @output.query('

       2:     <theater>

       3:         {

       4:             for $filmId in distinct-values(/theater/film/@film-id)

       5:             return 

       6:             <film>

       7:                 { attribute id { $filmId }}

       8:                 { attribute name { /theater/film [@film-id = $filmId][1]/@film-name }}

       9:                 {

                    for $printId in distinct-values(/theater/film [@film-id=$filmId]/@print-id)

                        order by $printId

                        return 

                        <print>

                            { attribute id { $printId }}

                            {

                                for $date in distinct-values(/theater/film [@film-id=$filmId and @print-id=$printId]/@date)

                                    order by $date

                                    return 

                                    <date>

                                        {attribute date { $date }}

                                        {

                                            for $auditoriumId in distinct-values(/theater/film [@film-id=$filmId and @print-id = $printId and @date=$date]/@auditorium-id)

                                            return

                                            <auditorium>

                                                { attribute id { $auditoriumId }}

                                                { attribute name { /theater/film [@film-id=$filmId and @print-id = $printId and @date=$date and @auditorium-id=$auditoriumId]/@auditorium-name }}

                                            </auditorium>

                                        }                                            

                                    </date>

                            }

                        </print>

                }

            </film>

        }

    </theater>

')

How does this suck? Let me count the ways

4 levels of nesting. Its not pretty. But, the VbScript has the same layers. (The logic is different, but its just as nested)

Each layer has to go back to the top and work its way back down based on the key information collected thus far

In the loops, I can only order by the loop indexer. For example: Auditorium. I don’t want to sort on “auditorium id”. I want to sort on display order. I can’t, because “auditorium id” is a value, not a node. If it was a node, I’d be able to get to a sibling attribute.

It offers a handy distinct-values, but does not offer a handy distinct-nodes. (there are example how to do distinct-nodes, but the few I’ve seen use the LET statement, which you can’t do in SQL 2005)

What doesn’t suck

Obviously I’m having problems with it, but that may just be due to my staggering 45 minutes of inexperience with it.

I like the syntax of specifying the attributes (shown) and elements (not shown) through the {} syntax

I like that the comments are smiley faces (not shown). (: this is an xquery comment :)

In principle, I like how you can do the layering.

The Problem

I got the XML that I want, but its slow. The SQL XQUERY consists of 2 parts: the query to get the data as xml, and the xquery to transform it the way I’d like it.

The first part comes back instantaneously. The 2nd part takes anywhere from 2 to 16 seconds. One time, it took a minute and 54 seconds?!. Its really inconsistent. I looked at the execution plan multiple times. Every time it says that the first query accounts for 0% of the time, and the 2nd query accounts for 100% of the time.

The legacy app does all it needs to do, including rendering it on the page, in a 1/2 second or less. You don’t even see it happen; you just click the link and the page renders.

I know that my xquery is amateur. If I can rewrite it the way it should be written and try again, maybe the results will be drastically improved. (At least I hope they are.)

Things that Would Help

SQL 2008 supports the LET statement. If I had that in 2005, then I could assign node sets at the various levels, and treat that as the root for that level. Then it wouldn’t have to go to the top of the document every time. (At least, it seems like I’d be able to do that)

If I could do a distinct-nodes instead distinct-values, then as I loop through, I can get the other stuff I need relative to the attribute. IE: $film-id/../@film-name.

Knowledge of XQuery would sure be helpful.

Next Steps / Conclusions

I wanted the source doc to be hierarchical so that its would be an accurate representation of the data. Since the XQuery didn’t work, I may end up doing it in C#. Then, the page will use an XSLT to render it. (I’ll look into using XQUERY to render it, but I don’t think that’s a viable option yet).

I developed the original application starting in 2001. Over the first few years, I spent a lot of time performance testing the quickest way to get the data out of the database and onto a page. I always lean towards XML and XSLT so that you can easily render it different ways. I want to keep it transformable.

Of all the things I tried, the quickest thing has always been:

Run the query and get back a flat dataset

Use code to convert the dataset to xml

Despite the repeating data, and despite the manual conversion, it wins every time.

Things I may try

Convert the SQL XML to my XML via XSLT

Convert the SQL XML to my XML via C# code (the old fashioned way with a new language)

Read more about XQuery to determine how off-target my query really is

Friday, March 27, 2009

Adventures in SCM

After a long hiatus, I am resuming work on JTS. JTS is a theater management system that <a href=’http://muvico.com’>Muvico</a> has been using since 2001ish.

2001. That was years ago. Those were the days of ASP and COM+. .NET was still called ASP+. I was still prefixing everything I did with the letter J.

The application now spans all generations of development technologies from ASP to .NET 3.5 SP1. Swell.

The first big effort is to downgrade all of the COM+ to ASP, because ASP is easier to work with these days. The second big effort will be to selectively convert the modules to the IN-DEVELOPMENT JTS 3 API.

I started researching what I can use for SCM. Ideally, I want it to be an online repository so that it doubles as backup service. (I use MOZY, but the client is dreadfully bad. Its almost unusable at this point. They said they’re rewriting it, but we’ll see.)

My googling lead me here http://www.myversioncontrol.com

For $5/month, I get what I need. Its subversion based, which I never used before. The service is excellent and its very generously priced. I have a security concern about it, though. I sent an email and we’ll see what happens. (it never asks for an encryption key. I’d like to keep their eyes out of my code.)

MyVersionControl recommends 2 SVN clients: RAPID SVD and TORTOISE SVN.

I couldn’t get RAPID to do anything. I got a lot of exceptions. I then tried TORTOISE which is a set of windows explorer shell extensions. That worked out pretty well. It took a little getting used to, though. My SCM exposure has been limited to VSS and StarTeam. This subversion stuff is different.

Finally, I tackled VS integration, which ended up being an easy task. I found this product: http://visualvsn.com. I installed it and started using it. Peace of cake. VisualVSN, like MyVersionControl, offers a 30 day trials.

Everything went well. I spent some time adding some filters to weed out the files that I don’t need to control. I checked everything in. My repository is now 50% full. Now I have 30 days to see if this sense of joy is permanent or fleeting.

Monday, March 2, 2009

Getting back on the ball with the DvdFriend, RSS feeds etc.

Today, I decided that I wanted to get my RSS feeds under control. I’m not very RSS saavy… I just use internet explorer to subscribe to them; nothing fancy. I ventured out to find a good web solution. I came across this little startup called “Google” which has, among other things, a decent reader.

I haven’t used any other readers, so when I say “decent”, its not relative to anything else. Its very functional, like Gmail. And, the user interface isn’t great, like Gmail. But, its pretty neat.

I have to rebuild a list of RSS feeds. If you come across this post, please let me know of your own personal feeds, and any others that you recommend. Please don’t assume that I already have it, even if I should.

Also, I downloaded and am currently using WINDOWS LIVE WRITER. Good stuff. This will greatly improve the readability for people who take offense to typos. (I’m thinking of someone particular. You know who you are.)

In a related story: I spent today getting a lot of stuff organized. I have multiple drives with duplicate code, documents, databases, etc. I’ve sorted through most of that. I also have some VMs:

1 – super secret side project that I dumped because it wasn’t respecting my time

2 – JTS

3 – Other development efforts, including DvdFriend

I’m getting all of those ducks in a line. The VM for #3 exists, but I haven’t setup the DvdFriend stuff yet. I’d like to get that going; I haven’t touched DvdFriend in months, and I’m itching to do some stuff. I started putting some prices and links in this week; Amazon and Netflix still work. Everyone else has changed their html, so the parser isn’t working. Oh well. 2 is better than none.

The last time I worked on DvdFriend, I made progress towards asking TV reviews at the SHOW, SEASON and EPISODE levels. Of course, that was months ago and I have no idea where I left off or how I did it. Lets hope that I can read my own code.

Hamlet Code