Mobile specific URLs? They're breaking the web

Those mobile-specific URLs (m.*, mobile.* and the like) have got to go. There are just too many of them popping up on Twitter and similar, and since for some reason the browser detection only pushes people from the desktop site to the mobile site and never the other way, if you follow one of these links on a desktop browser you end up stuck on the mobile version. A poor user experience, and it doesn’t sell many ads, either. People are just going to have to do that shit right: the same URL (i.e. resource) for everyone, and content negotiation (or “responsive design”) thereafter. (Set a cookie summarising what’s in the User-Agent header, if necessary.) It’s not that hard!
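To make the "same URL, negotiate afterwards" idea concrete, here is a minimal sketch of the decision a server could make per request. Everything in it is an assumption for illustration: the names classifyUserAgent, parseCookies and negotiate are made up, the cookie is arbitrarily called "variant", and the User-Agent test is deliberately crude. It expects lower-cased header names, which is how Node's http module presents them.

```javascript
// Hypothetical helper: summarise the User-Agent header as "mobile" or "desktop".
// The pattern is a crude assumption, not a complete device detector.
var classifyUserAgent = function (userAgent) {
    return /Mobi|Android|iPhone|iPad/i.test(userAgent || "") ? "mobile" : "desktop";
};

// Parse a Cookie header ("a=1; b=2") into a plain object.
var parseCookies = function (cookieHeader) {
    var cookies = {};
    (cookieHeader || "").split(";").forEach(function (pair) {
        var i = pair.indexOf("=");
        if (i > 0) {
            cookies[pair.slice(0, i).trim()] = pair.slice(i + 1).trim();
        }
    });
    return cookies;
};

// Decide which variant of the *same* URL to serve. A previously set cookie
// wins; otherwise fall back to sniffing the User-Agent header.
var negotiate = function (headers) {
    var fromCookie = parseCookies(headers["cookie"])["variant"];
    var variant = (fromCookie === "mobile" || fromCookie === "desktop")
        ? fromCookie
        : classifyUserAgent(headers["user-agent"]);
    return {
        variant: variant,
        responseHeaders: {
            // Tell caches that the response depends on these request headers.
            "Vary": "User-Agent, Cookie",
            // Remember the summary so later requests can skip the sniffing.
            "Set-Cookie": "variant=" + variant + "; Path=/"
        }
    };
};
```

Because the cookie takes precedence, a "view desktop version" link only needs to set the cookie to desktop; the URL being shared around never changes.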

Posted

State-of-the-art JS compilers: no bytecode, no intermediate representation other than source

A brief note on something I discovered recently and found surprising: JavaScript compilers, which these days end up essentially compiling the same source code into machine code multiple times (the same function might sometimes take ints and sometimes take floats, for example), do it by going back to the source code and re-parsing it. In particular, they don’t start from an intermediate representation like an AST or bytecode. This applies to both V8 and Firefox. (Actually it’s not clear that Firefox operates this way, but Eich says that it could.) I wonder if they have essentially two versions of the parser, one for unverified source code (better debugging information, but slower) and another for source code known to be syntactically correct? And what does this mean for Parrot?

Posted

Modifying protected properties in PHP

In a blog post, Matthew Weier O'Phinney pointed out that if class Derived extends class Base, Derived can access protected properties and methods of Base even in the case where an instance of Base is merely the argument to some method. (I expected Derived to be able to access $this->foo where $foo is protected, but I didn’t expect that Derived would be able to access $obj->foo where $obj is an argument.)

This works even if the functions are static, leading to a generic way to modify protected properties:

class Foo {

    protected $message = "jjj";

}

class Foo_Wrapper extends Foo {

    // Allows you to set both public and protected (!) properties

    static function set($obj, $property, $value) {
        $obj->{$property} = $value;
    }

    // Allows you to get both public and protected (!) properties

    static function get($obj, $property) {
        return $obj->{$property};
    }

}

$obj = new Foo();

Foo_Wrapper::set($obj, "message", "kkk");

echo Foo_Wrapper::get($obj, "message"), "\n";

Posted

The difference between GPL and Apache-licensed code for web developers: not so much, actually

For most web developers, there is no practical difference between using GPL’d code (version 2) and code with a more permissive license such as Apache or BSD. For some reason, a lot of people don’t seem to get this. (One situation in which there is a difference: if you’re distributing the actual code itself, for example if people download your code and run it on their own servers, you need to make the source code available. Another is where your code is “distributed” by being embedded on a hardware device, such as a phone.)

If the code runs on your servers only, you do not need to publicly release your changes. If you’ve written the code for a customer, you need to give them the source code, but there’s no legal obligation for the customer to make the code publicly available to anyone else.  (As an example, Google has written a lot of patches to MySQL and Linux, which have been deployed onto their machines.  They haven’t had to make any of these patches publicly available—although at least in the case of MySQL, they’ve subsequently made some of these changes public.)

This is more or less a loophole in the GPL—it was written at a time when interactions with software were mostly conducted with software running on your own machine, and the loophole is the reason the AGPL exists. The AGPL has the properties many people seem to think the GPL has: in the case of AGPL-licensed code, if you make modifications and run the modified code on your server, you do need to make the modifications public. (MongoDB is perhaps the most prominent example of software licensed in this way; they have a helpful licensing page describing the circumstances under which you do and don’t have to release any changes you might make.)

A few useful entries from the FSF’s GPL 2 FAQ:

(In reading the FAQ entries, note that “distribution” is not what you’re doing if you deploy software to web servers, or your customers' web servers.  Distribution is where customers get a copy of your software to run on their servers.)

Filed under  //  gpl   licence  
Posted

Why Aspect-Oriented Programming? What's wrong with pub/sub (or hooks)?

Is there any substantive difference between Aspect-Oriented Programming and the publish/subscribe messaging pattern, where you explicitly publish a message at join points? (I consider hooks and the observer pattern to be pretty equivalent to pub/sub, too.)

To ask the question another way, is there really any point pushing for existing languages to be enhanced to support AOP when you could just use pub/sub and get nearly all the benefits? With pub/sub you avoid all the syntax weirdness, the source code transformation weirdness (if you’re doing it that way), the cultural resistance from introducing AOP to a language that doesn’t already have it, the need to learn a new syntax/DSL to express join points…

The main disadvantage of hooks seems to be that you need to be able to modify existing code to use them. They’re also ugly if you want hooks absolutely everywhere, but against this they’re explicit, they can be triggered at absolutely any point, and they don’t need language support.

(Some examples: WordPress, Rails.)
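For comparison, here is roughly what the hooks approach looks like. This is a minimal sketch with made-up names (hooks, subscribe, publish, save and the "save:before"/"save:after" hook names are all hypothetical), not the API of WordPress, Rails or any AOP framework.

```javascript
// A minimal hook registry: hook name -> list of callbacks.
var hooks = {};

var subscribe = function (name, callback) {
    (hooks[name] = hooks[name] || []).push(callback);
};

var publish = function (name, payload) {
    (hooks[name] || []).forEach(function (callback) {
        callback(payload);
    });
};

var log = [];

// The "advice" is attached from outside the code it applies to.
subscribe("save:before", function (doc) { log.push("validating " + doc.title); });
subscribe("save:after", function (doc) { log.push("saved " + doc.title); });

// The existing code has to be modified to publish at its join points.
// This is the explicitness (and the ugliness) mentioned above.
var save = function (doc) {
    publish("save:before", doc);
    doc.saved = true; // stands in for the real work
    publish("save:after", doc);
    return doc;
};

save({ title: "hello" });
console.log(log); // [ 'validating hello', 'saved hello' ]
```

The trade-off is visible in save(): the join points are ordinary function calls that anyone can read, but they only exist where somebody has edited the code to put them.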

Filed under  //  aop architecture  
Posted

Recommendation: "Writing Testable Code"

A few guys at Google have put together a very readable PDF book called Writing Testable Code. It covers four flaws, and explains, in admirable detail and great clarity, why they are a bad thing, and how to fix them. I don’t usually like articles like this (I find them too vague and non-specific) but this is a brilliant exception. The four flaws are:

  • Constructor does real work
  • Digging into collaborators
  • Brittle global state and singletons
  • Class does too much

(It’s not just about testable code; it’s about good code in general. Testable code is much easier to reason about than untestable code.)

Posted

spl_object_hash: what is it good for??

Is it possible to do anything useful with PHP’s spl_object_hash() function? It returns an identifier when passed an object, but:

  1. If the object is destroyed, the identifier can be re-used.
  2. If the object is changed, the identifier stays the same.

(In having these two properties, it seems as though the identifier is essentially equivalent to the address of the object in memory.) What on earth is this for?

Posted

JavaScript CDNs: for effective caching be specific about the version you want

If you use Google’s CDN, if at all possible, you really want to be specific about the version of the library you want. Otherwise, if you say you want 1.7.x, the library has to be sent to the browser with instructions to check back every so often to see whether a newer release in the 1.7 series is available, and in Google’s case “every so often” is just 1 hour. On the other hand, if you are able to specify that you want Prototype 1.7.0.0, no more and no less, then the browser is able to cache the library for 1 year.

boom:~$ wget --quiet --save-headers -O - \
http://ajax.googleapis.com/ajax/libs/prototype/1.7/prototype.js | grep ^Cache\-Control
Cache-Control: public, must-revalidate, proxy-revalidate, max-age=3600
boom:~$ wget --quiet --save-headers -O - \
http://ajax.googleapis.com/ajax/libs/prototype/1.7.0.0/prototype.js | grep ^Cache\-Control
Cache-Control: public, max-age=31536000
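To check the arithmetic on those two max-age values, here is a small sketch. The maxAge helper is a made-up name, and it only looks for the max-age directive rather than parsing Cache-Control in full.

```javascript
// Hypothetical helper: pull the max-age value (in seconds) out of a
// Cache-Control header, or return null if there isn't one.
var maxAge = function (cacheControl) {
    var match = /(?:^|[\s,])max-age=(\d+)/i.exec(cacheControl);
    return match ? parseInt(match[1], 10) : null;
};

// The two header values from the wget output above.
var loose = maxAge("public, must-revalidate, proxy-revalidate, max-age=3600");
var exact = maxAge("public, max-age=31536000");

console.log(loose / 3600);  // 1 (hour)
console.log(exact / 86400); // 365 (days)
```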

Posted

Zend_Mail and Amazon's Kindle document converter

If you’re trying to use Zend_Mail to send email to Amazon’s free PDF-to-Kindle converter, you might like to know that for some reason two headers that Zend_Mail adds to the message—Content-Transfer-Encoding and Content-Disposition—trick Amazon’s converter into thinking that there are no attachments. (These emails are interpreted correctly by pretty much everything else as far as I can tell so I’m not sure if this is Zend_Mail’s fault, or Amazon’s, or both.)

There are probably several ways to fix this, but I fixed it by creating a new version of Zend_Mail_Transport_Smtp:

class Zend_Mail_Transport_Smtp_Kindle extends Zend_Mail_Transport_Smtp
{
    protected function _prepareHeaders($headers)
    {
        // Remove some headers from the mail enclosure--these somehow confuse
        // Amazon's Kindle converter, leading it to conclude that there are
        // no attachments.  (Even though every other mail client seems to be
        // able to cope.)

        if (array_key_exists("Content-Transfer-Encoding", $headers)) {
            unset($headers["Content-Transfer-Encoding"]);
        }
        if (array_key_exists("Content-Disposition", $headers)) {
            unset($headers["Content-Disposition"]);
        }

        return parent::_prepareHeaders($headers);
    }
}

Anyway, with this “bug” worked around, I can now do this:

[Screenshot: OS X print dialog, with "Send to Kindle" option]

My ~/Library/PDF Services/Send to Kindle reads:

#!/bin/bash

TITLE=$(perl -MURI::Escape -e 'print uri_escape(join(" ", @ARGV))' "$1")

cat "$3" | curl --data-binary @- "http://beebo.org/api/kindle/?username=ithinkihaveacat&title=$TITLE"

(See “Providing PDF Workflow Options in the Print Dialog” for more information.)

Posted

Understanding the Y Combinator by writing about it

Motivation

This post is written for those who, like me, have tried and failed to understand the Y Combinator despite the help of Wikipedia, blog posts constructing the Y Combinator in multiple languages, stories on Hacker News, comments on Hacker News… I think I mostly get it now (this comes courtesy of a night in Helsinki where I was up at a strange time due to the fallout from a Helsinki-Stockholm-Helsinki booze cruise)—and so perhaps these notes will help someone else. If nothing else, writing them up helped me! (Maybe the only way to understand it is to write a blog post about it.)

I think a large part of the problem understanding the Y Combinator is that the various bits of information available online are written from several different perspectives, which means that they emphasise different aspects of the Y Combinator story, to wit:

  • The properties of the Y Combinator function itself. Technically, the Y Combinator is one example of a fixed point combinator, meaning that it’s a higher order function with a particular property (to be described later). Sadly it seems this property is very hard to grok unless you’ve recently spent some time with lambda calculus (or at least it was for me), so discussion is usually motivated by one particular application of the Y Combinator, which is that…
  • The Y Combinator allows you to write “recursive” functions without explicit recursion. (“Without explicit recursion” means “recursion” without explicitly calling a function named foo in the body of a function that itself is called foo.) Thus, discussion of the Y Combinator is tied up with anonymous (i.e. unnamed) functions. (Note that the Y Combinator-powered version of a function will almost certainly not be any faster than the recursive version, and it’s going to need just as much stack space.) Finally, Y Combinator posts and articles often have an extended section on…
  • The implementation of the Y Combinator in some language. The Y Combinator is an important part of (theoretical) computer science, and its implementation typically stretches a language’s support for first class functions—it’s often instructive to compare how elegantly (or not) it can be implemented in various languages. (A few examples: JavaScript, Perl, PHP.)

Deriving the Y Combinator

Step 1

As I mentioned above, discussion of the Y Combinator is typically motivated by a particular problem: how to write “recursive” functions without explicit recursion. I don’t know any better way of going about it, so let’s first consider that standard exemplar of recursion, the factorial function, and transform it into a version that uses the Y Combinator. The standard implementation:

var fact = function (n) {
    if (n < 2) return 1;
    return n * fact(n - 1);
};

(I’m using JavaScript in these examples because it’s widely known and has good support for first class functions.)

Step 2

How might you go about removing the self-referential call to fact() within the function body? (The exact reasons for wanting to do this are not important, but for example suppose that you want to inline the entire call.) One obvious way is to use a keyword provided by the language that refers to the function itself such as JavaScript’s arguments.callee, but, well, this is cheating and we can do better!

Step 3

Our first attempt is to transform the fact() function slightly, so that fact becomes an argument to a wrapper function, instead of a free variable referring to the factorial function. We can then use this argument in the “recursive” call:

var fact_wrapper1 = function (fact) {
    return function (n) {
        if (n < 2) return 1;
        return n * fact(n - 1);
    };
};

At this point, the function contains no references to variables declared outside the scope of the function itself, though overall we haven’t made much progress because we also don’t have a working fact() function… If only there were some way to call fact_wrapper1() passing it its own return value as an argument! If we could get this to work, the return value of fact_wrapper1() would be the factorial function we want.
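One way to see what fact_wrapper1() is missing is to feed it by hand. The sketch below is an illustration only: never, fact0, fact1 and fact2 are made-up names. Each pass through the wrapper yields a function that gets factorial right for one more value of n before it runs out of "recursion".

```javascript
var fact_wrapper1 = function (fact) {
    return function (n) {
        if (n < 2) return 1;
        return n * fact(n - 1);
    };
};

// A stand-in for the factorial function we don't have yet.
var never = function (n) {
    throw new Error("recursion went deeper than we prepared for");
};

var fact0 = fact_wrapper1(never); // correct for n < 2 only
var fact1 = fact_wrapper1(fact0); // correct for n < 3
var fact2 = fact_wrapper1(fact1); // correct for n < 4

console.log(fact2(3)); // 6
```

Calling fact2(4) throws, because the chain bottoms out at never(). A real factorial function would need an endless tower of these, which is exactly the "pass it its own return value" problem.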

Step 4

Okay, so that attempt didn’t quite work out; let’s try a slightly different approach. Perhaps it’s possible to modify fact_wrapper1() so that it can be passed as an argument to itself. (This currently isn’t possible because fact_wrapper1()’s argument (fact) is a simple function that takes an integer argument and returns an integer; by contrast fact_wrapper1() itself is a higher-order function that receives a function as an argument and returns a simple function that takes an integer argument and returns an integer.) If we can pass the function to itself, we eliminate the need to figure out how to extract just its return value. fact_wrapper1() can be transformed into such a function as follows:

var fact_wrapper2 = function (f) {
    return function (n) {
        if (n < 2) return 1;
        return n * f(f)(n - 1);
    };
};

Note that the only real change is that f(f)(n - 1) replaces fact(n - 1). This is because f(f) returns a function, which we then apply to n - 1.

Step 5

Let’s just check to see if fact_wrapper2 can really be used as an argument to itself:

> var fact = fact_wrapper2(fact_wrapper2);
> fact(5);
120

Okay, that works: so far, so good! (You can copy-and-paste all these examples into a NodeJS session.)

We can also make fact_wrapper2() look a bit more like fact_wrapper1() by moving the weird f(f) call into a helper function:

var fact_wrapper2 = function (f) {
    var g = function (n) {
        return f(f)(n);
    };
    return function (n) {
        if (n < 2) return 1;
        return n * g(n - 1);
    };
};
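It's worth noting why the helper g wraps f(f) in a function instead of being defined as var g = f(f). JavaScript evaluates arguments and initialisers eagerly, so the shorter form would call f(f) straight away, which would call f(f) again, and so on until the stack runs out. The sketch below contrasts the two; eager_wrapper and lazy_wrapper are names made up for this comparison.

```javascript
var eager_wrapper = function (f) {
    var g = f(f); // evaluated immediately: re-enters eager_wrapper, forever
    return function (n) {
        if (n < 2) return 1;
        return n * g(n - 1);
    };
};

var lazy_wrapper = function (f) {
    var g = function (n) {
        return f(f)(n); // evaluated only when g is actually called
    };
    return function (n) {
        if (n < 2) return 1;
        return n * g(n - 1);
    };
};

var outcome;
try {
    eager_wrapper(eager_wrapper);
    outcome = "returned";
} catch (e) {
    // In Node (V8) the runaway recursion surfaces as a RangeError.
    outcome = "threw " + e.name;
}

console.log(outcome);
console.log(lazy_wrapper(lazy_wrapper)(5)); // 120
```

The same delaying wrapper, function (n) { return f(f)(n); }, survives all the way into the final Y() for this reason.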

Step 6

By adding another helper function, we can avoid the need for fact_wrapper2 to appear twice, and if we inline its definition, we can avoid the need to give the wrapper function a name at all:

var recur = function (f) {
    return f(f);
};

var fact = recur(function (f) {
    var g = function (n) {
        return f(f)(n);
    };
    return function (n) {
        if (n < 2) return 1;
        return n * g(n - 1);
    };
});

We still get the same result:

> fact(5);
120

Step 7

Next, let’s transform Step 6 by eliminating the local variable g, instead passing it in as an argument of a new inlined function:

var recur = function (f) {
    return f(f);
};

var fact = recur(function (f) {
    return (function (g) {
        return function (n) { // unchanged from above
            if (n < 2) return 1;
            return n * g(n - 1);
        }
    })(function (n) {
        return f(f)(n);
    });
});

Step 8

This lets us move the factorial function itself out of the helper code:

var fact_core = function (g) {
    return function(n) {
        if (n < 2) return 1;
        return n * g(n - 1);
    }
};

var recur = function (f) {
    return f(f);
};

var fact = recur(function (f) {
    return fact_core(function (n) {
        return f(f)(n);
    });
});

Step 9

Almost there! At this point we can also inline recur() itself:

var fact_core = function (g) { // same as above
    return function(n) {
        if (n < 2) return 1;
        return n * g(n - 1);
    }
};

var fact = (function (g) { // the inlined recur() function
    return g(g);
})(function (f) { // same as above; now an argument to the inlined recur()
    return fact_core(function (n) {
        return f(f)(n);
    });
});

Step 10

Finally, we can turn the helper code into a standalone function (i.e. eliminating the reference to fact_core in the body of Step 9’s fact by converting it into an argument), which we’ll call Y():

var Y = function (h) {
    return (function (f) {
        return f(f);
    })(function (f) {
        return h(function (n) {
            return f(f)(n);
        });
    });
};

var fact_core = function (g) { // same as above
    return function(n) {
        if (n < 2) return 1;
        return n * g(n - 1);
    }
};

var fact = Y(fact_core);

Hurrah! We have finally achieved Y Combinator:

> fact(5)
120
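Nothing in Y() is specific to factorials, so the same helper should work for any single-argument recursive function written in the same "core" style. As a quick sanity check, here is a sketch using Fibonacci numbers; fib_core and fib are made-up names, and Y() is repeated from above so the snippet can be pasted on its own.

```javascript
var Y = function (h) {
    return (function (f) {
        return f(f);
    })(function (f) {
        return h(function (n) {
            return f(f)(n);
        });
    });
};

// Same shape as fact_core: g stands in for "the function being defined".
var fib_core = function (g) {
    return function (n) {
        if (n < 2) return n;
        return g(n - 1) + g(n - 2);
    };
};

var fib = Y(fib_core);

console.log(fib(10)); // 55
```

Note that this Y() only passes a single argument (n) through to the recursive call, so a core function taking two or more arguments would need a variant of it.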

Notes and Observations

Neither Y() nor fact_core() is self-referential: neither name appears in the body of its own function. A lot of other functions are involved, but since they’re all anonymous, they can’t refer to themselves either. Hence, the Y Combinator can be used to implement self-referential (i.e. recursive) programs in languages that don’t otherwise support recursion.

The function fact() is technically a “fixed point” of the function fact_core(), and the Y Combinator is a function for finding such fixed points. A fixed point p of a function f is a value (here, itself a function) that satisfies the condition f(p) = p. What this actually means is somewhat hard to get one’s head around (see the Wikipedia page for more), but it’s easy to check that fact_core() and fact() do have this property, in the sense that fact_core(fact) behaves exactly like fact:

> fact_core(fact)(5);
120
> fact(5)
120

This means that the following expressions are all (mathematically) equivalent to the function fact:

  • Y(fact_core)
  • fact_core(Y(fact_core))
  • fact_core(fact)
  • fact_core(fact_core(fact_core(Y(fact_core))))
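The equivalence can be spot-checked by running each expression over a range of inputs. This is only a sketch (candidates and agree are made-up names, and agreeing on 0 to 10 is a check, not a proof), with Y() and fact_core() repeated from Step 10 so it runs on its own.

```javascript
var Y = function (h) {
    return (function (f) {
        return f(f);
    })(function (f) {
        return h(function (n) {
            return f(f)(n);
        });
    });
};

var fact_core = function (g) {
    return function (n) {
        if (n < 2) return 1;
        return n * g(n - 1);
    };
};

var fact = Y(fact_core);

// The four expressions from the list above.
var candidates = [
    Y(fact_core),
    fact_core(Y(fact_core)),
    fact_core(fact),
    fact_core(fact_core(fact_core(Y(fact_core))))
];

// Every candidate should return the same value as fact for each input.
var agree = candidates.every(function (candidate) {
    for (var n = 0; n <= 10; n++) {
        if (candidate(n) !== fact(n)) return false;
    }
    return true;
});

console.log(agree);                   // true
console.log(fact_core(fact) === fact); // false
```

The second line is the reason for the word "mathematically": the expressions compute the same function, but each call produces a new function object, so they are never identical in the === sense.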

Filed under  //  functionalprogramming   javascript   ycombinator  
Posted