Security on Shared Hosts

Shared hosts are a reality for many small businesses, and for businesses that aren’t oriented around moving massive amounts of data. This is a given - we can’t all afford racks full of dedicated servers. With that in mind, I would urge people to be more careful about what they do on shared hosting accounts. You should assume that anything you do is being watched.

Take, for example, the /tmp directory. I was doing some work this weekend for a friend whose account is housed on the servers of a certain very large hosting company. While tweaking some of his scripts, I noticed via phpinfo() that sessions were file-based and were being stored in /tmp. This made me curious as to whether any of that session data could be read by other accounts on the same server. My first move was simply to FTP in and cd to the /tmp directory. No go - they have the FTP accounts chrooted into a jail, so the obvious door is closed. However, the accounts have PHP installed, so I can do something like this in a PHP script:

    <?php system("ls -al /tmp"); ?>

With this little bit of code, I can look into the /tmp directory even if my FTP login is chrooted. Fortunately, session files on this host are mode 600, so they’re not publicly readable - this was my primary concern and the reason I took some time to check this out. But people are putting lots of things into /tmp with the misguided idea that it is their private temporary file dump, including one idiot who put a month’s worth of PayPal transaction data into /tmp and left it 644, so that it was publicly viewable. Now, I’m a nice guy and the only thing I’m going to do with this information is laugh at it. But keeping in mind how dirt cheap hosting accounts are, there’s not a high entry barrier for someone with fewer scruples.
The key thing to remember is that, if you need temporary file storage on a shared host, do it someplace less obvious, set the permissions so that only you can read/write to it (600), and clean up by deleting files as soon as you possibly can.
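As a rough illustration of that advice, here is a minimal PHP sketch. The helper name private_temp_file() and the directory names are made up for this example; on a real shared host you would pass a directory inside your own account instead of the system temp directory used in the demo.

```php
<?php
// Sketch only: private_temp_file() is a hypothetical helper, not a PHP built-in.
// $baseDir is an assumption: on a shared host, point it at a directory inside
// your own account rather than the shared /tmp.
function private_temp_file(string $baseDir): string
{
    // Create the directory if needed; 0700 lets only the owner list or enter it.
    if (!is_dir($baseDir) && !mkdir($baseDir, 0700, true)) {
        throw new RuntimeException("Could not create $baseDir");
    }
    chmod($baseDir, 0700);

    // tempnam() creates an empty file with a unique name.
    $path = tempnam($baseDir, 'work_');
    if ($path === false) {
        throw new RuntimeException("Could not create a temp file in $baseDir");
    }
    chmod($path, 0600); // owner read/write only
    return $path;
}

// Demo only: the system temp dir is used here so the sketch runs anywhere.
$dir  = sys_get_temp_dir() . '/private_demo_' . getmypid();
$file = private_temp_file($dir);
file_put_contents($file, "scratch data\n");
printf("%s has mode %o\n", $file, fileperms($file) & 0777);

// Clean up as soon as the data is no longer needed.
unlink($file);
rmdir($dir);
```

The directory gets 700 rather than 600 because a directory needs the execute bit before its owner can enter it; the files inside stay at 600.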

Installing PECL PS on Mac OS X

The PHP that comes standard with Mac OS X Leopard doesn’t include the PECL PS extension. PECL PS requires pslib, and the last pslib version I verified to work with the PS extension was 0.2.6 (I still have an outstanding bug for that). A minor bug prevents pslib from compiling on OS X, so here are the steps necessary to get PECL PS working on Leopard:

1. Download PSLib 0.2.6.
2. Unpack it somewhere on your filesystem (I use /usr/src).
3. cd pslib-0.2.6/src
4. Apply this patch to pslib.c: patch pslib.c leopard_pslib-0.2.6.patch
5. cd ../
6. ./configure
7. make
8. make install (by default this puts it in /usr/local/lib)
9. Now install the PS extension using PECL: pecl install ps
10. When it asks for the path to the pslib installation, enter /usr/local/lib
11. Once it’s done compiling, add the .so to your php.ini. You may have to move the .so or alter extension_dir in your php.ini.
12. sudo apachectl restart
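Once Apache is back up, a quick way to confirm that PHP actually picked up the extension is to ask it. This is a small sketch, assuming the extension registers under the name "ps"; ps_extension_status() is a made-up helper name.

```php
<?php
// Sketch: report whether the PS extension is loaded.
// Assumes the extension registers itself as "ps"; the helper name is hypothetical.
function ps_extension_status(): string
{
    if (!extension_loaded('ps')) {
        // php_ini_loaded_file() returns false when no php.ini was loaded.
        $ini = php_ini_loaded_file() ?: 'php.ini (none loaded)';
        return "ps extension not loaded; check extension_dir and the extension line in $ini";
    }
    return 'ps extension loaded, version ' . phpversion('ps');
}

echo ps_extension_status(), PHP_EOL;
```

Run it through the web server rather than the command line if the two use different php.ini files.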

ngrep and memcache

You can use the ngrep command on Linux to “watch” what is going into and coming out of memcache. ngrep is an amazingly useful tool for troubleshooting a wide array of network issues; I have previously used it extensively for troubleshooting SIP errors. In this case, I’m using it to be sure memcache sessions in PHP are actually working.

    codelemur ~ # ngrep -d lo port 11211
    interface: lo (
    filter: (ip) and ( port 11211 )
    ####
    T -> [AP]
      get a804f5517468d4696c60da7eaf8a7179..
    ##
    T -> [AP]
      VALUE a804f5517468d4696c60da7eaf8a7179 0 16..test|s:4:"test";..END..
    ##
    T -> [AP]
      set a804f5517468d4696c60da7eaf8a7179 0 1440 16..test|s:4:"test";..
    #
    T -> [AP]
      STORED..

It doesn’t help much if you have multiple memcache servers (which is kinda the point of memcache), and since you’re looking at raw data, you can’t read the payload if it’s compressed. But in a testing environment, it’s a great way to be sure all things are kosher.
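If the set line in that capture looks cryptic, it follows the memcached text protocol: set, then the key, flags, expiry time in seconds, and the payload length in bytes. The sketch below pulls those fields apart and checks the length against the payload shown in the capture. parse_memcache_set() is a name invented for this example, and it assumes the ".." after each line is just ngrep’s rendering of the line ending.

```php
<?php
// Sketch: split a memcached text-protocol "set" command line into its fields.
// parse_memcache_set() is a hypothetical helper name.
function parse_memcache_set(string $line): ?array
{
    // Format: set <key> <flags> <exptime> <bytes>
    if (!preg_match('/^set (\S+) (\d+) (\d+) (\d+)$/', trim($line), $m)) {
        return null;
    }
    return [
        'key'     => $m[1],
        'flags'   => (int) $m[2],
        'exptime' => (int) $m[3], // seconds until the item expires
        'bytes'   => (int) $m[4], // length of the payload that follows
    ];
}

// Command line and payload as they appear in the capture.
$cmd     = parse_memcache_set('set a804f5517468d4696c60da7eaf8a7179 0 1440 16');
$payload = 'test|s:4:"test";';

print_r($cmd);
var_dump(strlen($payload) === $cmd['bytes']);
```

The key is the PHP session ID and the payload is the serialized session, so a matching byte count is a decent sign the session handler is writing what you expect.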

PHP, PostScript and ATM Fonts

Recently, I’ve been experimenting with PHP’s PS functions - the PECL extension that allows you to output PostScript directly from your scripts. There are other projects that come to mind (html2ps is another one that will render to PostScript), but I wanted something more tightly integrated into my script. Mysteriously, when I went to install my scripts on the new Poweredge I bought, I began to get these strange errors:

    ps_findfont() [function.ps-findfont]: PSlib warning: Trying to insert the glyph '.notdef' which already exists. Please check your afm file for duplicate glyph names.

I couldn’t understand what was going on - it was working fine on the previous server. After googling about the web and racking my brains for about two hours, I checked the versions of PSlib installed on the two servers. Both versions were masked in Gentoo’s Portage system, but the one I had unmasked on the previous server was 0.2.6, whereas the one on the new server was 0.4.1. After I masked out 0.4.1 (thanks to Gentoo’s awesome package.mask) and downgraded back to 0.2.6, everything began working again.

So there you have it. Apparently the PECL PS extension is not completely compatible with the most recent version of PSlib, and downgrading seems to fix it. Hope this helps somebody!

PHP/MySQL in Huntsville/North Alabama

Just to let everyone know, we’re trying to get a little meetup.com group going for developers interested in PHP and MySQL in the Huntsville and North Alabama area. I’ll be attending, and I know Brian will be giving mini-talks for the first few meetings. You can visit Brian Moon’s blog for more information.

PHP Templating Celebrity Deathmatch!

Ladies and Gentlemen! Welcome to the PHP Templating Celebrity Deathmatch!

I actually do like the idea behind templating. I know there are varying arguments about whether or not templating is appropriate for PHP, though those are not the focus of this entry. The big idea behind templating is separation of concerns, that is, breaking a program into parts that are easily manageable and don’t overlap in functionality. In an ideal world, templating would provide the added advantage of allowing a programmer to be a programmer and not a web designer - and allowing a web designer to be a web designer and not a programmer - by keeping the logic underlying the presentation layer to a minimum. However, I have never found this to be true in any project I’ve worked on in my professional career.

One of the big benefits, as far as I can see, is that it makes code much easier to read. This may not be true for everyone, but I would much rather be confronted with smooth, separated, templated code than a jumbled PHP mess. It’s easier to read and far, far easier to adapt and change.

While I was attending OSCON a few weeks ago, I heard mention of a new PHP templating engine that was written in C and compiled natively into a PHP extension. This would make it much, much faster than anything written in PHP itself - in theory. This project, called Blitz, was making some pretty grand claims on its website, so I wanted to put them to the test - at least a small timing test.

In this test, I am going to be comparing Smarty (the most widely used PHP templating engine and an official PHP project), Blitz (a new templating engine currently under very active development that is compiled natively as a PHP extension), and standard PHP includes. For the purposes of this test, I wrote a quick timing function that uses microtime() to record how much time has elapsed between each call of mark_time(). The code is available in the accompanying project.
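For readers without the accompanying project to hand, a timing function of that kind might look roughly like the following. This is a guess at its shape based only on the description above (microtime() plus time elapsed between calls), not the original code.

```php
<?php
// Hypothetical reconstruction of mark_time(); the real one lives in the
// accompanying project and may differ.
function mark_time(): float
{
    static $last = null;           // timestamp of the previous call

    $now     = microtime(true);    // current time in seconds, as a float
    $elapsed = ($last === null) ? 0.0 : $now - $last;
    $last    = $now;

    return $elapsed;               // seconds since the previous call
}

echo mark_time(), "\n";   // first call just starts the clock
usleep(1000);             // stand-in for the work being timed
echo mark_time(), "\n";   // seconds elapsed since the first call
```

With this shape, the second number printed by each test script is the figure of interest.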
A Note About The Tests

These are not meant to be exhaustive tests by any means. They are just designed to give you a 5,000-foot overview of the current state of PHP templating. They only evaluate page generation time and not other metrics such as CPU load, IO load, or memory usage. Furthermore, I selected three scenarios that I have commonly used in templating; there may be scenarios I haven’t tested where one method outperforms the others. And, as with any benchmarking, the results depend on my system - YMMV.

Test 1: Instantiation

This is a simple test that determines how much time it takes to power up the templating engine and get it loaded into memory for PHP to use. For the purposes of this test, we will just be comparing Smarty and Blitz, as there is no need for instantiation with a standard PHP include. We’ll start with Smarty.

smarty_instantiation.php:

    <?php
    echo mark_time()."<br>";
    include "Smarty.class.php";
    $smarty = new Smarty;
    echo mark_time()."<br>";
    ?>

Smarty’s instantiation time was 0.0058109760284424, or about 0.006 seconds in human terms.

blitz_instantiation.php:

    <?php
    echo mark_time()."<br>";
    $blitz = new Blitz;
    echo mark_time()."<br>";
    ?>

Blitz’s instantiation time was 3.0994415283203E-5, or 0.00003 seconds in human terms. It may not seem like a big difference, but this is one area where having Blitz as a PHP extension makes a huge difference over Smarty being written in PHP and included. Because PHP must search the include_path for Smarty.class.php before including it, PHP is slowed down before it can even instantiate the Smarty object. To be fair, I decided to run the test again with the include outside the timing marks.
smarty_instantiation2.php:

    <?php
    // The include now happens before the first timing mark.
    include "Smarty.class.php";
    echo mark_time()."<br>";
    $smarty = new Smarty;
    echo mark_time()."<br>";
    ?>

Even without having to search the include_path for Smarty, it still took 6.5088272094727E-5, or 0.00007 seconds, to instantiate the Smarty object - roughly twice as long as it took to instantiate the Blitz object. However, this is not a realistic scenario: the include still has to happen somewhere before PHP can use the Smarty class, so that time is never really saved.

Winner: Blitz

Test 2: Simple Template Rendering

In this test, we will be comparing simple template rendering in Blitz, Smarty and PHP includes. We will create a simple HTML template with two variables that need to be replaced, then render and display it to the user using each engine or, in the case of PHP, straight PHP. So, let’s get started! We’ll run Blitz first, since it won the previous test.

blitz_simple_render.php:

    <?php
    echo mark_time()."<br>";
    $blitz = new Blitz('blitz_simple_render.tpl');
    echo $blitz->parse(array(
        'title' => "Blitz Test!",
        'body' => "Blah foo! I'm a body!"
    ));
    echo mark_time()."<br>";
    ?>

Blitz took an impressive 0.00011801719665527, or 0.0001 seconds, to render a simple HTML document with two replacements. Smarty’s next.

smarty_simple_render.php:

    <?php
    echo mark_time()."<br>";
    include "Smarty.class.php";
    $smarty = new Smarty;
    $smarty->assign('title',"Smarty Test!");
    $smarty->assign('body',"Blah foo! I'm a body!");
    $smarty->display('smarty_simple_render.tpl');
    echo mark_time()."<br>";
    ?>

Because Smarty is a compiling engine (it compiles the templates to PHP and caches them), the first run is always the most costly - in this case, an atrocious 0.058284997940063, or 0.06 seconds. Even subsequent runs took 0.0065691471099854, or 0.007 seconds, again much slower than Blitz. Finally, standard PHP includes.

php_simple_render.php:

    <?php
    echo mark_time()."<br>";
    $title="PHP Test!";
    $body="Blah foo! I'm a body!";
    include "php_simple_render.tpl.php";
    echo mark_time()."<br>";
    ?>

Surprisingly, standard PHP includes took 0.00030016899108887, or 0.0003 seconds: much faster than Smarty, but about two and a half times as slow as Blitz. Once again, this likely has to do with PHP having to search the include_path before finding the appropriate file. If you specify the absolute path on the filesystem to the file above, the time drops to 0.00010490417480469, or 0.0001 seconds, roughly equal to Blitz on any given run. However, because Blitz is able to parse the template with a minimum of fuss, whereas I have to explicitly specify the filesystem path for PHP to get equal performance, this round also goes to Blitz.

Winner: Blitz

Test 3: Complex Templating

In this case, we are going to be doing complex templating. This test includes three template-based includes, one foreach loop over an array, and a large array of generated data. For the curious, the generation of the data is not counted towards the timing. We have generated a 10,000 item array and are going to have each engine iterate over it.

blitz_complex_render.php:

    <?php
    echo mark_time()."<br>";
    $blitz = new Blitz('blitz_complex_render.tpl');
    foreach($arr as $array) {
        $blitz->block('master_loop',array(
            'id' => $array['id'],
            'id1' => $array['id+1']
        ));
    }
    echo $blitz->parse(array(
        'title' => "Blitz Complex Render text"
    ));
    echo mark_time()."<br>";
    ?>

Blitz ran the test in 0.072134971618652, or 0.07 seconds, not too shabby considering it had to iterate over a 10,000 item multidimensional array.

smarty_complex_render.php:

    <?php
    echo mark_time()."<br>";
    include "Smarty.class.php";
    $smarty=new Smarty();
    $smarty->assign('title',"Smarty Complex Render test");
    $smarty->assign('arr',$arr);
    $smarty->display('smarty_complex_render.tpl');
    echo mark_time()."<br>";
    ?>

Again, because Smarty is a compiling engine, the first run is always the most expensive - in this case, a whopping 0.31642484664917, or 0.3 seconds.
Subsequent runs fell in the range of 0.099456838607788, or 0.1 seconds, three times as fast as the first run but still slower than Blitz. Finally, standard PHP includes.

php_complex_render.php:

    <?php
    echo mark_time()."<br>";
    include "php_complex_render.tpl.php";
    echo mark_time()."<br>";
    ?>

In this test, raw PHP includes came in at 0.055343866348267, or 0.06 seconds: the fastest of all, though only a little faster than Blitz.

Winner: PHP

Conclusion

Blitz won two of the three tests and came in a close second in the last. Of course, one could argue that PHP “won” the first test, since it has no instantiation step to be tested. Considering the short amount of time Blitz has been under active development, its sheer speed is rather amazing. From a templating standpoint, Blitz is the fastest unless you are willing to jump through lots of little hoops to make standard PHP includes work for you, and even at that point, total page generation time is roughly equal, though native PHP may have a slight advantage.

Unfortunately, the very strength of Blitz (it being written in C and compiled into a PHP extension) is also its greatest weakness. Because so many websites are served off shared hosts where users cannot install external extensions, most of the community will never be able to take advantage of Blitz. Only those with access to the machine, or more specifically the php.ini file, will be able to use Blitz unless it is merged into the PHP tree. Even in the best case, considering how many shared hosts are still running PHP4, I wouldn’t expect to see anything like this soon, if ever. Perversely, the very weakness of Smarty (that it is written in PHP and included) is its strength, for the reasons above.
Smarty is the slowest templating engine tested. However, because it is just PHP, it can be included and run like any other PHP script - meaning all the people on shared hosting can make use of it with a minimum of fuss. And in Smarty’s defense, it has many features (such as template variable modifiers) that are simply not available in Blitz. These features come with the tradeoff of a massive loss in speed. It was honestly surprising to me how slow it was.

Ultimately, it is up to the programmer to decide which method is right. If you want the advantages of templating as far as separation of concerns and ease of maintenance, and you have the ability to install extensions, Blitz is probably a good choice for you. If you still want the ease of maintenance and separation of concerns provided by templating and are willing to accept a massive loss of speed, Smarty is a possibility also. If sheer speed is your primary concern and you’re not willing to make any kind of tradeoff, going with raw PHP is probably your best option, provided you fine-tune it a bit to get the absolute best performance out of it.
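To put the complex-rendering numbers side by side, here is a small sketch that expresses each timing as a multiple of the fastest one. The figures are copied from the results above; relative_to_fastest() is just an illustrative helper name.

```php
<?php
// Timings in seconds for Test 3, copied from the results reported above.
$complex = [
    'PHP include'        => 0.055343866348267,
    'Blitz'              => 0.072134971618652,
    'Smarty (cached)'    => 0.099456838607788,
    'Smarty (first run)' => 0.31642484664917,
];

// Hypothetical helper: express each timing as a multiple of the fastest one.
function relative_to_fastest(array $timings): array
{
    $fastest = min($timings);
    $ratios  = [];
    foreach ($timings as $name => $seconds) {
        $ratios[$name] = round($seconds / $fastest, 1);
    }
    return $ratios;
}

foreach (relative_to_fastest($complex) as $name => $ratio) {
    printf("%-20s %.1fx\n", $name, $ratio);
}
```

On these figures Blitz comes out at about 1.3 times the raw PHP time, cached Smarty at about 1.8 times, and Smarty’s first run at about 5.7 times. They are single runs on one machine, so treat the ratios as rough.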

AGI + PHP: Using PHP to route phone calls!

Hello there! I figure that if I’m going to start using this blog to post the wanderings and wonderings of a mid-level engineer at a dot-com company (I work at dealnews, to be specific, and I guess I should include the standard disclosure that my employer does not endorse or support anything that I say/do here), perhaps I should give some substance to my first post. So, I figured I would write a post on something I have plenty of experience with: PHP.

But what to write about? Surely, there must be ten million PHP tutorials on the ‘net, and I don’t need to add to the noise already out there as to what are/aren’t the best practices using PHP, so I thought about using PHP in some lesser known areas. And here is one lesser known, but very cool area: you can use PHP to route phone calls!

At a previous employer, I worked with Asterisk as a software development consultant. My primary role was to build web interfaces to Asterisk (and other telecom hardware) backends, though while working as a consultant I learned quite a bit about extending Asterisk to do crazy cool things.

“It’s Just Software!”

Asterisk is an open-source software PBX that was created by Mark Spencer (an Auburn grad and now CEO at Digium). It is quickly becoming a challenger in the PBX market (fact: we use it at dealnews), and an entire industry has sprung up around Asterisk and open-source IP telephony.

For the purposes of this tutorial, I’m going to assume that you already have Asterisk installed and configured to your liking, and now wish to extend it beyond what it is capable of doing with the built-in dialplan applications. If this is not a good assumption in your case, may I highly suggest the Asterisk Tutorial at voip-info.org, or even better, the O’Reilly Asterisk book, which is a little dated but still quite relevant to most beginner-level stuff.
Meet AGI, CGI’s hard-working cousin:

AGI, or the Asterisk Gateway Interface, is the key to extending Asterisk beyond what it is capable of doing on its own. AGI gives Asterisk the ability to run and interact with scripts and programs outside of Asterisk. AGIs can be written in any language that can be executed on a Linux system (and there have been AGIs written in PHP, Python, Perl, C, Bash and just about every other language out there). Since PHP is my language of choice, that is what I’m going to concentrate on in this tutorial.

Asterisk AGIs are actually incredibly simple creatures. When run from within the Asterisk dialplan, they simply send commands to Asterisk using standard output and read the results on standard input. It’s what happens between those that is really, really cool.

Enough Talk! Code or GTFO!

So, let’s get started! First, you need to set up your script environment. I recommend doing this in an include-able file so that you can reuse it in future AGIs. There are a few commands you need to know about:

    <?php
    // This turns on implicit flushing, meaning PHP will flush the buffer after
    // every output call. This is necessary to make sure that AGI scripts get their
    // instructions to Asterisk as soon as possible, rather than buffering until
    // script termination.
    ob_implicit_flush(true);

    // This sets the maximum execution time for the AGI script. I usually like to
    // keep this set low (6 seconds), because the script should complete pretty
    // quickly and the last thing we want it to do is hang a call because the script
    // is churning.
    set_time_limit(6);

    // This sets a custom error handler function. We'll get back to this later.
    set_error_handler("error");

    // This opens standard input for reading by our script.
    $in = fopen("php://stdin","r");

    // This opens standard error for writing, for debugging.
    $stdlog = fopen("php://stderr", "w");

    // Debug flag checked by the functions that follow; set to true to log to standard error.
    $debug = false;
    ?>

Okay, that’s not too bad! Now, we’re going to do a little more advanced stuff.
Every time an AGI script executes, Asterisk passes a number of values (about 20) to the script. These AGI headers take the form of “key: value”, one per line separated with a line feed (\n), concluding with a blank line. Before we can read them, we need to write a few functions to read from AGI input, write to Asterisk, execute commands, and write to the Asterisk CLI. These are the functions I use:

    <?php
    function read() {
        global $in, $debug, $stdlog;
        $input = str_replace("\n", "", fgets($in, 4096));
        if ($debug) {
            fputs($stdlog, "read: $input\n");
        }
        return $input;
    }
    ?>

So what are we doing here? Well, in the first line, we strip out the line feed from each chunk we get from stdin. Then, we check to see if $debug is set and, if so, echo what we read to standard error. Finally, we return the line we just read. Pretty simple, right? Well, this little function will save you lots of time. Next, we need a way to write data:

    <?php
    function write($line) {
        global $debug, $stdlog;
        if ($debug) {
            fputs($stdlog, "write: $line\n");
        }
        echo $line."\n";
    }
    ?>

This function is even simpler: it just writes out to standard error if $debug is on, and outputs whatever was sent to it with an additional new line. This next function, however, is more complex.

    <?php
    function execute($command) {
        global $in, $debug, $stdlog;
        write($command);
        $data = fgets($in, 4096);
        // First pattern: a numeric status code followed by the rest of the line.
        if (preg_match("/^([0-9]{1,3}) (.*)/", $data, $matches)) {
            // Second pattern: result=<result>, optionally followed by (<data>).
            // The optional minus sign lets "result=-1" match as well.
            if (preg_match('/^result=(-?[0-9a-zA-Z]*)( ?\((.*)\))?$/', $matches[2], $match)) {
                $arr = array();
                $arr['code'] = $matches[1];
                $arr['result'] = $match[1];
                if (isset($match[3]) && $match[3]) {
                    $arr['data'] = $match[3];
                }
                if ($debug) {
                    fputs($stdlog, "CODE: " . $arr['code'] . " \n");
                    fputs($stdlog, "result: " . $arr['result'] . " \n");
                    // 'data' is only set when Asterisk sent a parenthesised part.
                    fputs($stdlog, "data: " . (isset($arr['data']) ? $arr['data'] : '') . " \n");
                    fflush($stdlog);
                }
                return $arr;
            } else {
                return 0;
            }
        } else {
            return -1;
        }
    }
    ?>

Woah, complex! Well, not really.
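To see what those two patterns actually pull out of a reply, here is a standalone sketch of the same two-step parse that can be run from the command line without Asterisk. parse_agi_response() is a name made up for this example, and the optional minus sign in the second pattern is an assumption added so that a reply of result=-1 also parses.

```php
<?php
// Standalone sketch: parse one AGI response line into code/result/data.
// parse_agi_response() is a hypothetical name, not part of Asterisk or PHP.
function parse_agi_response(string $line)
{
    $line = rtrim($line, "\r\n");

    // Step 1: numeric status code, then the rest of the line.
    if (!preg_match('/^([0-9]{1,3}) (.*)/', $line, $matches)) {
        return -1;    // no status code at all
    }

    // Step 2: result=<result>, optionally followed by (<data>).
    if (!preg_match('/^result=(-?[0-9a-zA-Z]*)( ?\((.*)\))?$/', $matches[2], $match)) {
        return 0;     // status code present, but no result= part
    }

    return [
        'code'   => $matches[1],
        'result' => $match[1],
        'data'   => isset($match[3]) ? $match[3] : '',
    ];
}

// A few sample replies in the shape Asterisk sends them.
$samples = [
    "200 result=1\n",
    "200 result=0 (timeout)\n",
    "200 result=-1\n",
    "510 Invalid or unknown command\n",
];
foreach ($samples as $reply) {
    var_export(parse_agi_response($reply));
    echo "\n";
}
```

The first three samples come back as arrays; the last has a status code but no result= part, so it takes the "return 0" branch.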
execute() is the Swiss army knife of AGI programming: it allows you to do interactive stuff inside this AGI script. First, as you can see, it calls the write() function we just wrote, writing an AGI command to Asterisk. Then it looks for a response on standard in. A response from Asterisk takes the form of “<code> result=<result> (<data>)”, where the data part is optional. So, we use preg_match to get this out for us and put it into something usable. We do the debug output again, then return the array, or 0 or -1 in the event of failures. Just two more functions to go:

    <?php
    function verbose($str,$level=0) {
        $str=addslashes($str);
        execute("VERBOSE \"$str\" $level");
    }

    function error($errno,$errst,$errfile,$errline) {
        verbose("AGI ERROR: $errfile, on line $errline: $errst");
    }
    ?>

As you can see, these two functions are very simple. One gives verbose output to the Asterisk CLI, and the other is the error function we declared using set_error_handler above.

Back to reading in variables. Now that we have the ability to read in, let’s read in the default variables that are passed to the script by Asterisk. We do this using the following code chunk:

    <?php
    $_AGI = array();
    // read() returns an empty string at the blank line that ends the headers.
    while ($env=read()) {
        // explode() stands in for the old split(), which newer PHP versions no longer have.
        $s = explode(": ",$env,2);
        $key = str_replace("agi_","",$s[0]);
        $value = isset($s[1]) ? trim($s[1]) : "";
        $_AGI[$key] = $value;
        if($debug) {
            // Log to standard error here: calling verbose() before all headers
            // are read would consume the next header as its reply.
            fputs($stdlog, "Registered AGI variable $key as $value.\n");
        }
    }
    ?>

This creates an $_AGI associative array (in the spirit of $_POST, $_GET, etc) for you to use, containing all the items Asterisk passed in. For each line from read(), we first split it to get the key and value (this could probably have been done better with a regular expression, but I got a copy of some AGI code from a friend and modified it many moons ago, before I began using regular expressions). Then, we strip out the “agi_” that Asterisk adds to the key because it is superfluous, and trim the spaces and other garbage from the value, adding them to the array.

Putting It All Together:

Congratulations!
You now have all the tools necessary to write an AGI! I suggest (as above) putting those in an include so you can reuse them as necessary. So what next? Now, you write an AGI script! Let’s start with a simple example:

    #!/usr/bin/php
    <?php
    include "agi.php";
    // SAY DATETIME takes a Unix timestamp and the escape digits ("#" here).
    execute("SAY DATETIME " . time() . " #");
    ?>

That simple! Of course, all this AGI does is read the date and time to the caller, then exit, but it shows that AGIs can do really powerful things, really simply.

“Calling” your AGI:

So now you have this AGI written and you want to use it, but you don’t know how. Well, this is pretty easy too! AGIs should be placed in whatever directory you define for “astagidir” in your asterisk.conf file. Unless you changed it, this will be /var/lib/asterisk/agi-bin. Next, be sure that the file is executable by setting the executable bit: “chmod +x <filename>”. You may also have to fiddle with the permissions: the asterisk user or group needs the ability to read and execute the script. Then, you just call it from your dialplan, like so:

    exten => 1000,1,AGI(<filename>)

Now, after you “extensions reload” of course, you should be able to dial 1000 and watch your AGI spring into action!

A more complex example:

This is an AGI I wrote at dealnews when someone in the office requested the ability to assign custom names to caller IDs and have it work on all phones. Keep in mind that this is only half of the solution (the other half is a web interface).
    #!/usr/bin/php
    <?php
    include "agi.php";

    // mysqli_* is used here because the older mysql_* functions are gone from current PHP.
    $db = mysqli_connect('redacted', 'redacted', 'redacted');
    if (!$db) {
        verbose("Could not connect to DB!");
        exit(1);
    }
    if (!mysqli_select_db($db, 'redacted')) {
        verbose("Could not use DB!");
        exit(1);
    }

    // Escape the caller ID before it goes into the query.
    $cid = mysqli_real_escape_string($db, $_AGI['callerid']);
    $res = mysqli_query($db, sprintf("select substitution_name from cid_substitution where from_number='%s'", $cid));
    $result = $res ? mysqli_fetch_row($res) : null;

    // Only rewrite the caller ID when a matching row was found.
    if ($result) {
        execute(sprintf("SET CALLERID \"%s <%s>\"", $result[0], $_AGI['callerid']));
    }
    mysqli_close($db);
    ?>

This demonstrates one of the main advantages of using AGIs, and PHP in particular: the ability to easily interact with databases. In this program, I’m using the caller ID supplied by the carrier to fetch a corresponding name from a database and send it back along with the call. Routing calls is accomplished by calling the EXEC function with DIAL, giving you the ability, with a little work, to route calls based on the database. Pretty neat for a language thought of only as web coding. Indeed, there is a large list of commands that AGIs can use, and variables passed into them, available here.

Help! It doesn’t work!

Relax! Problems happen from time to time. One of the most common faults is forgetting to set the +x bit on the file to make it executable. Permissions problems are also relatively common.

For More Information:

voip-info.org - a.k.a. “the wiki,” is the major information repository for Asterisk knowledge specifically, and IP telephony in general.
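Coming back to the remark about routing calls with EXEC and DIAL, here is a rough sketch of how the command string might be built. Everything specific in it is an assumption for illustration: the $routes array stands in for a database lookup, the numbers and SIP/… channel names are invented, dial_command() is a made-up helper, and the comma between Dial arguments follows newer Asterisk releases (older ones used a pipe character).

```php
<?php
// Hypothetical routing table standing in for a database lookup:
// caller ID => channel to ring. Numbers and channel names are made up.
$routes = [
    '2565550100' => 'SIP/1001',
    '2565550101' => 'SIP/1002',
];

// Hypothetical helper: build the AGI command that asks Asterisk to run Dial.
// Assumes comma-separated Dial arguments; older Asterisk versions used "|".
function dial_command(string $callerId, array $routes, string $default, int $timeout = 20): string
{
    $target = isset($routes[$callerId]) ? $routes[$callerId] : $default;
    return sprintf('EXEC Dial "%s,%d"', $target, $timeout);
}

echo dial_command('2565550100', $routes, 'SIP/1000'), "\n";
echo dial_command('unknown', $routes, 'SIP/1000'), "\n";
```

Inside a real AGI, the resulting string would be handed to the execute() function from earlier, with $_AGI['callerid'] as the caller ID.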