Google used to offer a SOAP API for spelling suggestion/correction but put it out of service in November 2010. Since then, the only way I have found to reliably get Google’s recommended spelling for a misspelled phrase is through the same interface their toolbar browser extension uses to correct spelling mistakes. My SpellCheck project is a PHP5 tool that asks Google for its suggestion for a given phrase the same way the toolbar does, then parses the response and updates the original phrase to reflect the recommended changes.
The Google toolbar sends an HTTP POST request to https://www.google.com/tbproxy/spell?lang=en&hl=en, originally containing an XML body as shown below (Note: appls and ornages is the phrase being queried).
<?xml version="1.0" encoding="utf-8" ?>
<spellrequest textalreadyclipped="0" ignoredups="0" ignoredigits="1" ignoreallcaps="1">
<text>appls and ornages</text>
</spellrequest>
The returned response would come as XML too, with a body something like
<?xml version="1.0" encoding="utf-8" ?>
<suggestions>
<c o="0" l="5">apples\tapple\tapps</c>
<c o="10" l="7">oranges\torange</c>
</suggestions>
Each c node contains an o attribute, the offset at which the word to be replaced starts, and an l attribute, the length of that word. The text content of each c node is a tab-delimited list of suggestions, ordered by how likely each is to be what you really meant to type.
Google’s HTTP responses no longer contain o or l attributes, suggesting their toolbar does a bit more work to determine where replacements should be made based on the suggestions returned.
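For the original response format, where each c node still carried o and l attributes, applying the suggestions could look roughly like the Ruby sketch below. It is an illustration only, not the SpellCheck implementation: the apply_suggestions name is hypothetical, a simple regular expression stands in for a real XML parser to keep the sketch dependency-free, and the first suggestion in each list is assumed to be the one to use. Replacements are applied from right to left so that earlier offsets stay valid when a suggestion differs in length from the word it replaces.

```ruby
# Hypothetical helper: apply the first suggestion of every <c> node to the
# original phrase, using the o (offset) and l (length) attributes.
def apply_suggestions(phrase, response_xml)
  # Assumption: a regex is enough for this flat response format; real code
  # should use an XML parser.
  corrections = response_xml.scan(/<c o="(\d+)" l="(\d+)"[^>]*>([^<]*)<\/c>/).map do |offset, length, suggestions|
    { offset: offset.to_i, length: length.to_i, best: suggestions.split("\t").first }
  end

  corrected = phrase.dup
  # Work backwards so replacing one word does not shift later offsets.
  corrections.sort_by { |c| -c[:offset] }.each do |c|
    next if c[:best].nil?
    corrected[c[:offset], c[:length]] = c[:best]
  end
  corrected
end

response = <<~XML
  <?xml version="1.0" encoding="utf-8" ?>
  <suggestions>
  <c o="0" l="5">apples\tapple\tapps</c>
  <c o="10" l="7">oranges\torange</c>
  </suggestions>
XML

puts apply_suggestions("appls and ornages", response) # apples and oranges
```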
While rewriting an application in Ruby, I needed to find a different way to get suggestions from Google. Using Nokogiri’s CSS selector support, this proved to be trivial.
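require 'cgi'
require 'open-uri'
require 'nokogiri'
require 'awesome_print'

# Join all arguments so the phrase works with or without quotes
query = CGI.escape(ARGV.join(' '))
doc = Nokogiri::HTML(URI.open("http://www.google.com/search?q=#{query}"))
# The suggestion link is expected inside the #topstuff block
nodes = doc.css('#topstuff p a')
ap nodes[0].content if nodes.length > 0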
(Note: You can put this in spell_suggest.rb and run ruby spell_suggest.rb "appls and ornages")
Here the main Google search results page is queried and parsed instead of working with the Google toolbar. An added benefit of this solution is that the suggestions seem to account for context better than the toolbar’s. The toolbar’s suggestions seem to inspect and process each word independently, whereas the main search page accounts for the entire phrase. A word that might be considered misspelled on its own may make perfect sense in context. For example, my last name Daly might come back from the toolbar with the suggestion Daily when the query is Jason Daly, but the Ruby solution leveraging Google’s main search page returns no suggestion.
(Note: This technique can be applied anywhere. This article is specifically related to a Symfony 1.4 implementation)
As recommended in the Google Page Speed best practices, static resources should be cached locally in the browser. This means setting a far-future expiration date on file types such as .css and .js content. This first step is achieved rather simply by adding something like the following to the application’s .htaccess file or virtualhost’s httpd.conf include.
<IfModule mod_expires.c>
Header set cache-control: public
ExpiresActive on
ExpiresDefault "access plus 1 month"
# Other ExpiresBy... declarations here...
# css and javascript
ExpiresByType text/css "access plus 1 month"
ExpiresByType application/javascript "access plus 1 month"
ExpiresByType text/javascript "access plus 1 month"
</IfModule>
FileETag None
Note that the last line above removes ETags since, according to Yahoo, ETags are not needed for static content with far-future expiries set.[1]
The changes above will cause CSS and Javascript files to be cached locally in the browser for 1 month from the first access time, causing future requests to be made without checking against the server for newer versions of these static resources. Though the caching desired is now in place, it’s not perfect since if these static resources are modified in any way, the cached local browser version of these files will not be invalidated.
Currently in the <head> of my application, default stylesheets are added and then all included stylesheets (including those added by the specific action being requested) have their <link ... /> tags constructed and appended to the DOM using the following code
use_stylesheet('main.css');
use_stylesheet('handheld.css', '', array('media' => 'handheld'));
include_autoversioned('stylesheets');
The include_autoversioned() function is where the work is done which invalidates the browsers’ cache. This function is a wrapper of Symfony’s include_stylesheets() and include_javascripts(). I have added a Template.php file to my application’s lib/helpers/ directory which contains the function as shown below.
/**
 * Depending on the type of include requested (stylesheets or javascripts),
 * rewrites each file already added to the response object, i.e. /css/base.css,
 * to a path containing the file's mtime, i.e. /css/base.1221534296.css.
 *
 * @see http://stackoverflow.com/questions/118884/what-is-an-elegant-way-to-force-browsers-to-reload-cached-css-js-files
 *
 * @param string $type The type of include to output: 'stylesheets' or
 *                     'javascripts'
 *
 * @throws InvalidArgumentException when invalid type is requested
 *
 * @return void (outputs string response directly)
 */
function include_autoversioned($type = 'stylesheets'){
    if (!in_array(strtolower($type), array('stylesheets', 'javascripts'))) {
        throw new InvalidArgumentException(sprintf("\$type can only be 'stylesheets' or 'javascripts': '%s' was passed", $type));
    }

    // Resolves to Symfony's get_stylesheets() or get_javascripts()
    $function = sprintf('get_%s', $type);
    $code = $function();
    unset($function);

    $code = preg_replace_callback('/(href|src)\=\"([^\"]+)\"/', function($matches){
        // Leave paths that are not absolute or do not exist in the web directory untouched
        if (strpos($matches[2], '/') !== 0 || !file_exists(sfConfig::get('sf_web_dir') . DIRECTORY_SEPARATOR . $matches[2])) {
            $path = $matches[2];
        } else {
            // Inject the file's last-modified time before its extension
            $mtime = filemtime(sfConfig::get('sf_web_dir') . DIRECTORY_SEPARATOR . $matches[2]);
            $path = preg_replace('{\\.([^./]+)$}', ".$mtime.\$1", $matches[2]);
        }

        return str_replace($matches[2], $path, $matches[0]);
    }, $code);

    echo $code;
}
(Note: The above code requires >= PHP5.3 due to the use of an anonymous function)
Also, again in the application’s .htaccess file or virtualhost’s httpd.conf include, the following must be added above the other RewriteRules that are in place for Symfony applications by default.[2]
# Cache-busting js and css files
RewriteRule ^(.*)\.[\d]{10}\.(css|js)$ $1.$2 [L]
To show what this does, let’s look at the <link ... /> tags generated using Symfony’s get_stylesheets in the <head>
<link rel="stylesheet" type="text/css" media="screen" href="/css/main.css" />
<link rel="stylesheet" type="text/css" media="handheld" href="/css/handheld.css" />
and compare this to the output generated by the new include_autoversioned('stylesheets')
<link rel="stylesheet" type="text/css" media="screen" href="/css/main.1282090705.css" />
<link rel="stylesheet" type="text/css" media="handheld" href="/css/handheld.1282076131.css" />
By modifying the filenames of these static resources by injecting a unix timestamp for the last-modified time of each file, whenever a change is made to one of these files, however small, it will cause the filename of that resource to change, effectively invalidating the local browser’s copy of that file. This guarantees that a browser will maintain a copy of a static resource for as long as can be reasonably expected, while always guaranteeing they immediately receive the latest changes to those resources as soon as they’re published.
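The round trip between the versioned URL and the real file can be illustrated in isolation. The Ruby sketch below mirrors the two regular expressions used above: one injects a timestamp before the extension, the other strips it again as the RewriteRule does. The autoversion and strip_version names are hypothetical, and the timestamp is passed in directly rather than read from disk so the example stays self-contained.

```ruby
# Hypothetical helper: inject a modification timestamp before the extension,
# as the preg_replace() call in include_autoversioned() does.
def autoversion(path, mtime)
  path.sub(/\.([^.\/]+)\z/, ".#{mtime}.\\1")
end

# Hypothetical helper: mirror the RewriteRule by removing a 10-digit
# timestamp that precedes a .css or .js extension.
def strip_version(path)
  path.sub(/\A(.*)\.\d{10}\.(css|js)\z/, '\1.\2')
end

versioned = autoversion("/css/main.css", 1282090705)
puts versioned                 # /css/main.1282090705.css
puts strip_version(versioned)  # /css/main.css
```

Because the server maps the versioned name back to the same file on disk, nothing needs to be renamed when a file changes; only the URL the browser sees is different.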
[1] The .htaccess section was taken in part from http://github.com/paulirish/html5-boilerplate
[2] Though modified considerably, the cache invalidation was inspired by this stackoverflow answer
After implementing query caching using Doctrine’s Doctrine_Cache_Apc interface in a Symfony application I am working on, when running the application’s functional tests, warnings were returned intermittently in a few places
$ [apc-warning] Potential cache slam averted for key ...
In APC >= 3.0, an apc.slam_defense configuration option was added in an attempt to avoid repeated writes to the same APC cache key, as might occur under very high traffic. This option was deprecated in favor of apc.write_lock as of APC >= 3.0.11 and was later removed. apc.write_lock is enabled by default, so the first thing I tried in my development environment was disabling it. A simple test case still fails, though, with apc.write_lock disabled.
apc_store('my_key', 1);
apc_store('my_key', 2);
echo apc_fetch('my_key'); // outputs 1, not 2
One of the more recent comments in this PECL ticket for APC offers a patch that can be applied to the latest release of APC before compiling it. This patch re-introduces the previously removed apc.slam_defense option. For my development environment only (since the cache slam warnings thrown in this environment due to my functional tests are irrelevant), I fixed the cache slam warnings by following these steps
In php.ini, add the newly recognized apc.slam_defense=0.
With apc.slam_defense in place and disabled, the test case above and my application’s functional tests run without any cache slam warnings.
There may be times when part or all of a text/plain version of an action’s template needs to be rendered from within another action whose display format is text/html. For example, if an application has a text/plain version of an order receipt, and that same version needs to be mailed to a user from within a separate action, we need to tell the order receipt to render as text/plain; otherwise it will attempt to render in the format of the current action, in this case text/html. This assumes the email itself is being sent as text/plain and the rendering action’s format is text/html.
Symfony offers an sfController::getPresentationFor() method which renders an action’s view from within another action, even if the calling action resides in a separate module. The trick is specifying the format for the other action while maintaining the requested format for the current action. The getPresentationFor() method signature is limited in its support for specifying a format, so the only option appears to be as follows.
// In the current action
$format = sfContext::getInstance()->getRequest()->getRequestFormat();
sfContext::getInstance()->getRequest()->setRequestFormat('txt');
$otherActionContent = sfContext::getInstance()->getController()->getPresentationFor('moduleName', 'otherActionName');
sfContext::getInstance()->getRequest()->setRequestFormat($format);
unset($format);
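The above code stores the current request format, switches it to txt while the other action is rendered, and then restores the original format.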
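The save, switch and restore steps form a general pattern, and it is worth restoring the original format even when rendering fails. The Ruby sketch below shows the idea outside of Symfony; the Request class and with_format helper are hypothetical stand-ins, not part of any framework API.

```ruby
# Hypothetical stand-in for a request object that tracks a response format.
class Request
  attr_accessor :format

  def initialize(format)
    @format = format
  end
end

# Temporarily switch the request format, run the block, then restore the
# original format even if the block raises.
def with_format(request, temporary_format)
  original = request.format
  request.format = temporary_format
  begin
    yield
  ensure
    request.format = original
  end
end

request = Request.new('html')
receipt = with_format(request, 'txt') { "rendered as #{request.format}" }
puts receipt         # rendered as txt
puts request.format  # html
```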
Doctrine provides a stock Sluggable template behavior that can be applied to models, generating a slug based on one or more columns for a Doctrine_Record instance.
In most cases it is best practice not to modify a slug after it is generated, since one or more external sources may be referencing a URL to the Doctrine_Record instance by the original slug. If the slug is modified, those URLs referencing the original slug will become broken. For this reason, Doctrine’s default behavior is to have the canUpdate option for the Sluggable template be set to false.
There may, however, be reason to override this disabled canUpdate behavior under certain conditions. Doctrine 1.2 does not seem to provide an easy way to access a specific Doctrine_Record_Listener and modify an option associated with that listener for a given Doctrine_Record instance. Assuming there is only a single instance of any given listener associated with a Doctrine_Record, it’s fairly straightforward to find the desired listener within the Doctrine_Record_Listener_Chain. Applying this search conditionally in a Doctrine_Record::preUpdate() hook would look something like the code shown below.
// In your Doctrine_Record class
public function preUpdate($event){
    if (true) { // Conditions to check against for whether or not to allow the slug to be modified go here
        try {
            SomeUtilityClass::getListener($event->getInvoker()->getListener(), 'Sluggable')->setOption('canUpdate', true);
        } catch (Exception $e) {
            // Silently fail
        }
    }
}

// In SomeUtilityClass
class SomeUtilityClass {
    static public function getListener(Doctrine_Record_Listener_Chain $chain, $search) {
        $i = 0;

        do {
            if ($chain[$i] == null) {
                throw new InvalidArgumentException('The requested Doctrine_Record_Listener is not associated with this Doctrine_Record instance');
            }

            if (get_class($chain[$i]) === sprintf('Doctrine_Template_Listener_%s', $search)) {
                return $chain[$i];
            }
        } while (++$i);
    }
}
The above getListener() method is required since it appears the only access to Doctrine_Record_Listener is via an implementation of PHP’s predefined ArrayAccess interface. Since the Doctrine_Record_Listener_Chain class only implements the set() and get() methods from the ArrayAccess interface, the only way to access the listeners is via their automatically assigned numeric keys. There is no way to directly request a specific listener by name through the Doctrine 1.2 API, thus the necessity for the do...while loop.
Symfony comes with some great i18n and l10n support. The DateHelper comes with functionality for templates to provide locale-specific date formats. There are a bunch of default formats that can be easily used as
format_date(strtotime('now'), 'P', 'en_US'); // Outputs 'Monday 7 June 2010'
format_date(strtotime('now'), 'P', 'fr_FR'); // Outputs 'Lundi 7 Juin 2010'
Some of the pre-defined patterns are constructed using special token patterns. These tokens can be passed to the 2nd parameter of the format_date() function above to generate custom formatted locale-specific date/time stamps.
format_date(strtotime('now'), 'EEE, MMMM dd, yyyy hh:mm a'); // Outputs 'Mon, June 07, 2010 12:40AM'
It doesn’t appear detailed information is available in any of the main documentation books or in the source code regarding the tokens available for date/time formatting, but I did come across a wiki page in the symfony trac.
After learning Github is now offering SVN support (nearly all of my development work is done using SVN), I decided it was time to properly version my small changes to the great gRaphael library by forking the original code with my own account. I also decided to start to maintain smaller utilities I write for personal use through git on Github as well. The first of these packages is \Deefour\SpellCheck.
Introducing “SpellCheck v1.0”. As mentioned in the README,
SpellCheck leverages “…the XML request/response used by the Google Toolbar…” accepting “… a string to be transformed into a corrected version of itself.”
This class simply makes all corrections suggested by Google to the original string passed in. Admittedly, this is not as flexible as some will like, but for now it suits my needs and is a great start. Some small points:
Usage instructions and code can be found in the v1.0 Tag on GitHub.
To prevent user-generated content formatted with Markdown from being re-parsed by the Markdown parser every time it is rendered as HTML, it is best practice to store the converted HTML alongside the original Markdown markup. The content then only needs to be parsed when necessary, which can be handled in Doctrine’s preSave() hook.
First, the schema for the table can be very simple.
Message:
  columns:
    message:
      type: text
      notnull: true
    raw_message:
      type: text
      notnull: true
The raw_message field will contain the Markdown text as entered by the user. The message field will contain the converted HTML equivalent of the user-generated Markdown syntax.
When saving the record with Doctrine, only the raw_message field needs to be set manually.
$message = new Message();
$message->setRawMessage($some_posted_form_message);
$message->save();
Finally, in the Message class (an implementation of Doctrine_Record), a preSave() event listener is added to conditionally parse the user-generated Markdown into HTML and store it to the message field.
class Message extends BaseMessage {
    public function preSave($event){
        // Only re-parse for new records or when the Markdown source has changed
        if (!$this->exists() || array_key_exists('raw_message', $this->getModified())) {
            $markdown = new MarkdownExtra_Parser();

            // escape any raw HTML so it is displayed as text rather than rendered
            $value = htmlentities($this->raw_message, ENT_QUOTES, 'UTF-8');
            $this->message = $markdown->transform($value);
        }
    }

    // ...
}
Now, whenever a Message is saved, if the raw_message field value has been changed, it will be re-parsed, converted to HTML and stored into the message field.
Some Conversation model might have a many-to-many relationship between itself and some Participant model through some ConversationsParticipants model. By default in Doctrine, to get all Participants for some Conversation, the magic finder for
$participants = $conversation->Participants;
would return Participant instances in the order they were created (ordered by the primary key). When a custom ordering of the list of Participants is required, the magic finder must be overloaded within the base model. For example, if the participants are to be ordered by username ASC, we would create a getParticipants() method in the Conversation model like this
public function getParticipants(){
    $q = Doctrine_Query::create()
        ->from('User u')
        ->innerJoin('u.ConversationsParticipants cu')
        ->where('cu.conversation_id = ?', $this->id)
        ->orderBy('u.username ASC');

    return $q->execute();
}
This is great for models which have been persisted in the database already, but becomes problematic when retrieving the Participants relation for some new Conversation which has yet to be saved into the database. Since the above overloaded getParticipants() finder is querying the database against a Conversation which does not yet exist in the database, we need to conditionally ignore this custom finder’s query and default back to the magic finder within the Doctrine_Record class.
public function getParticipants(){
    // Only query the database for conversations that have been persisted
    if (!$this->isNew()) {
        $q = Doctrine_Query::create()
            ->from('User u')
            ->innerJoin('u.ConversationsParticipants cu')
            ->where('cu.conversation_id = ?', $this->id)
            ->orderBy('u.username ASC');

        return $q->execute();
    } else {
        // New, unsaved conversation: fall back to the default magic finder
        return parent::__get('Participants');
    }
}
As of Symfony 1.3, the loadHelpers() method in the sfLoader class is deprecated.
The sfLoader::loadHelpers() method is deprecated. Please use the same method from sfApplicationConfiguration.
Though it only took 3 minutes to find the proper way to call the sfApplicationConfiguration instance, I thought I’d post it here for others.
sfContext::getInstance()->getConfiguration()->loadHelpers('Template');
Soon after updating to PHP 5.3 on my local computer, I found that some legacy code I maintain, which cannot currently be updated, no longer worked. After a bit of research[1], I found the best solution for now was to downgrade to the latest stable version of PHP 5.2, PHP 5.2.10. I then needed to reinstall a few PEAR and PECL extensions that much of the code I maintain depends on.
[~] $ sudo pecl install memcache
pecl.php.net is using a unsupported protocal - This should never happen.
install failed
[~] $ sudo pear install Image_Color
pear.php.net is using a unsupported protocal - This should never happen.
install failed
I knew my PHP installs were done within /usr, so I found the PEAR .channels/ directories with
[~] $ cd /usr
[/usr] $ sudo find . -type d -name .channels
./lib/php/.channels
./share/pear/.channels
deleted them
[/usr] $ sudo find . -type d -name .channels -exec rm -rf {} \;
and then updated the channels
[/usr] $ sudo pear update-channels
Now installing extensions from PEAR and PECL is back to normal.