Wednesday, July 16, 2008

Debugging

Recently I have been trying to get OpenID working on a website. I want to use the JanRain OpenID PHP library, and it looked like the EasyOpenID extension to it would make things easier, so I wanted to use that as well.

My first few attempts with them seemed to have various setup issues (the blame for which rests solely with me, I concede), so I decided to remove everything and start fresh. I remembered at some point one of the issues being about getting the proper include path, so the first thing I did was put all of the .php files in the same directory. I also re-did the 'include' and 'require_once' lines to accommodate this, and removed the initial bits of the example OpenID consumer code that set up the php include path. I figured this would at least eliminate one of the problems, and if I got it all working from here, it should be feasible to undo this step.

So, the index.php loaded correctly and asked for my OpenID url. I entered one, and was sent to try_auth.php, where after a few seconds I was presented with a "500 Internal Server Error". Hmm. This is code other people have written and published. People who I expect blow me out of the water, programming-wise, so something strange is going on. Nevertheless, I tried tracking down the error. The only way I knew to do this was to put lots of error_log lines into the program to trace execution, and update my php.ini on the server to write errors to a log file. Reloading the try_auth.php file (with my url specified as a GET parameter) I'd still get the 500 error, and now I could track what was going on by also reloading the error log file. Slow and tedious, but it seemed to work.

When I finally tracked down the last line of the code that was executing, I was fairly surprised where it was. It was in JanRain's BigMath.php, and the actual line that was failing was a call to bcpowmod, a built-in PHP function (part of the encryption/decryption process, if I understand correctly). Since bcpowmod is only available in newer versions of PHP, BigMath.php has it's own powmod function, which gets called if bcpowmod is not available. So I changed the code to skip bcpowmod, and use the powmod function in BigMath. Still died. The powmod function is basically a single while loop, so I stuck a counter in to see how much it was looping. Somewhere among 500-600 iterations of the loop, the code would fail. The precise number of times was different each time, but it was always somewhere in that range. In the process I found the bcpowmod was also getting called at an earlier stage, where it finished without issue.

Ok, take a step back. Maybe it was something screwy with the OpenID provider I had picked. So I tried another one. No dice. When I finally decided to have some friends try, they also had the same error, and looking at the error log, it was in the same place. Ah, maybe wrap the offending code in a try/catch block. No dice. No exception thrown.

Well, maybe there was actually something strange with bcpowmod. So I had the code print out the values it was calling bcpowmod with (nice long integers) in the error log. I copied those into a little local script whose only job was to call bcpowmod. Worked fine. Ok, upload that to the server. Worked fine there. Hmm.

Perhaps the server was cutting the script off at a certain amount of time. I seem to recall, in reading Google App Engine documentation (not what I'm using, but still), that code had only a few seconds to return before it was interrupted, for performance reasons. Perhaps that's how the server I'm on is set up. So I found that php.ini contains those sort of limiting parameters, both execution time and memory usage. Comparing with my local php.ini I noticed the memory usage allowance was a little lower on the server, so I tried temporarily up-ing it. Didn't work. The execution time allowance was set at the default 300 (seconds), and the script was dying after about 6, so I figured that wasn't it. All the same, to test it, I changed the allowed time to about 5 seconds, to see what would happen when the script was forced to quit early. For grins, I also told the script to pause for a few seconds using a usleep() call. This gave me a different error message, so I decided that wasn't the problem.

So, now I'm out of ideas. I contacted the hosting company by submitting a support request online. Tracking the work log, it looks like they've taken a look (at something, my notice at least, hopefully the issue itself), but it's now been 3 days (2 since they last checked it out). Makes me feel like this is at least a little bit of a justifiable puzzler.

Anybody have any ideas? If this gets resolved sometime, I'll let you know how. Perhaps it's just time to start over fresh? Maybe without the EasyOpenID bit? Just get something, anything, working?

No comments: