A Basketful Of Papayas: July 2012

2012-07-31

What Iterators Can Do For You

Basically, iterators provide a list interface for an object. Like all interfaces, they are a contract for how something can be used. If you use an interface, it is not relevant how it is implemented: the implementation logic is encapsulated.

It is of course relevant on the integration level. A bad implementation can hurt the performance of your application. Even a good implementation may need special resources (like a database). But none of this changes how you use it. Your code using the object with the Iterator interface stays the same.

Let's start with a simple example that outputs a list.

$lines = array(

  'line one', 'line two'

);



foreach ($lines as $key => $value) {

  echo $key.': '.$value."\n";

}

If we transfer this into an object, it looks like this:

class MyProjectLineOutput {

  private $_lines = NULL;



  public function __construct($lines) {

    $this->_lines = $lines;

  }



  public function __invoke() {

    foreach ($this->_lines as $key => $value) {

      echo $key.': '.$value."\n";

    }

  }

}



$output = new MyProjectLineOutput(

  array(

    'line one', 'line two'

  )

);

$output();

At first glance that looks like a lot more work, but it isn't. The code includes two tasks: get the lines and output them. The class encapsulates the output task and makes it reusable. In this simple example that may look superfluous, but think a little larger, like outputting a select field or CSV.
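
To make the "think a little larger" point concrete, here is a minimal sketch of a second output class that renders a select field. The class name MyProjectSelectOutput and the markup are assumptions for illustration only. It returns the markup instead of echoing it, which makes it easier to test.

```php
<?php
// Hypothetical class, not part of the original example.
class MyProjectSelectOutput {

  private $_options = NULL;

  public function __construct($options) {
    $this->_options = $options;
  }

  // Build the select field from anything foreach can handle.
  public function __invoke() {
    $html = "<select>\n";
    foreach ($this->_options as $value => $caption) {
      $html .= sprintf(
        "  <option value=\"%s\">%s</option>\n",
        htmlspecialchars($value),
        htmlspecialchars($caption)
      );
    }
    return $html."</select>\n";
  }
}

$select = new MyProjectSelectOutput(
  array('one' => 'Line One', 'two' => 'Line <Two>')
);
$html = $select();
echo $html;
```

The data source stays exactly the same as for MyProjectLineOutput, only the output task changed.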

Encapsulate file()

So far we haven't used an Iterator, we have just encapsulated the output. PHP provides several default iterators, and one of them is the ArrayIterator.

$output = new MyProjectLineOutput(

  new ArrayIterator(

    array(

      'line one', 'line two'

    )

  )

);

$output();

The ArrayIterator just takes an array and makes it an Iterator. It is mostly used to implement another interface: IteratorAggregate. Both the "Iterator" and the "IteratorAggregate" interfaces inherit from a common ancestor named "Traversable". You cannot implement "Traversable" directly, but you can use it to check whether an object is traversable, in other words whether it can be used with foreach.
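
A small sketch of such a check. ArrayObject is used here only because it is a built-in class implementing IteratorAggregate. The function name countLines is made up for this example. Note that a plain array works with foreach but is not an instance of Traversable.

```php
<?php
$iterator = new ArrayIterator(array('line one', 'line two'));
// ArrayObject is a built-in IteratorAggregate
$aggregate = new ArrayObject(array('line one', 'line two'));
$plainArray = array('line one', 'line two');

$results = array(
  'iterator' => $iterator instanceof Traversable,
  'aggregate' => $aggregate instanceof Traversable,
  'array' => $plainArray instanceof Traversable
);
var_dump($results);

// Hypothetical helper: the type hint accepts both kinds of objects.
function countLines(Traversable $lines) {
  return iterator_count($lines);
}

$countFromIterator = countLines($iterator);
$countFromAggregate = countLines($aggregate);
```

If arrays should be accepted too, newer PHP versions (7.1 and later) provide is_iterable() for that.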

Now let's load the lines from a file.

class MyProjectFile implements IteratorAggregate {

  private $_file;



  public function __construct($file) {

    $this->_file = $file;

  }



  public function getIterator() {

    return new ArrayIterator(file($this->_file));

  }

}



$output = new MyProjectLineOutput(

  new MyProjectFile('sample.txt')

);

$output();

The main difference from calling file() directly is that the file is read later in the process. The foreach inside MyProjectLineOutput::__invoke() calls MyProjectFile::getIterator(). Until then we can pass the instance of MyProjectFile around without loading the file into memory. Unlike a direct call to file(), we don't pass the concrete data around but the information about how it can be obtained.
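
The deferred loading can be made visible with a counter. The following is only a sketch: MyProjectCountingFile and its public $reads property are hypothetical additions, and the return type declaration assumes PHP 7 or later.

```php
<?php
// Hypothetical variant of MyProjectFile that counts how often the file is read.
class MyProjectCountingFile implements IteratorAggregate {

  private $_file;
  public $reads = 0;

  public function __construct($file) {
    $this->_file = $file;
  }

  public function getIterator(): Traversable {
    $this->reads++;
    return new ArrayIterator(file($this->_file));
  }
}

// Create a small temporary file to read from.
$path = tempnam(sys_get_temp_dir(), 'lines');
file_put_contents($path, "line one\nline two\n");

$file = new MyProjectCountingFile($path);
$readsBeforeLoop = $file->reads; // nothing has been loaded yet

$lines = array();
foreach ($file as $key => $value) {
  $lines[$key] = rtrim($value);
}
$readsAfterLoop = $file->reads; // foreach called getIterator() once

unlink($path);
```

Creating the object costs nothing. The file is read when foreach asks for the iterator, and it is read again for each new foreach.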

Iterating A Text File

By implementing Iterator directly, we can make sure that only one line of the file is held in memory at a time.

class MyProjectFileUnbuffered implements Iterator {

  private $_file;

  private $_handle = NULL;

  private $_key = -1;

  private $_current = NULL;



  public function __construct($file) {

    $this->_file = $file;

  }



  public function __destruct() {

    if (is_resource($this->_handle)) { 

      fclose($this->_handle);

    }

  }



  public function rewind() {

    if (!is_resource($this->_handle)) { 

      $this->_handle = fopen($this->_file, 'r');

    } else {

      fseek($this->_handle, 0);

    }

    $this->_key = -1;

    $this->next();

  }



  public function next() {

    // read if we were just rewound (key is -1) or the last read succeeded
    if ($this->_key < 0 or $this->_current !== FALSE) {

      $this->_current = fgets($this->_handle);

      $this->_key++;

    }

  }



  public function key() {

    return $this->_key;

  }



  public function current() {

    return $this->_current;

  }



  public function valid() {

    return $this->_current !== FALSE;

  }

}

This is more source than implementing it directly, mostly because of the class and function declarations. But you have to write this only once, and it can be improved or replaced without affecting the usage.
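
As an example of "replaced without affecting the usage": the SPL already ships SplFileObject, which iterates a file line by line. The sketch below swaps it in for the class above (it is not the class above) and combines it with LimitIterator so that iteration stops after two lines. The temporary file exists only for the demonstration.

```php
<?php
// Create a small temporary file to read from.
$path = tempnam(sys_get_temp_dir(), 'lines');
file_put_contents($path, "line one\nline two\nline three\n");

// Iterate line by line, but stop after the first two lines.
$firstTwo = array();
foreach (new LimitIterator(new SplFileObject($path), 0, 2) as $key => $line) {
  $firstTwo[$key] = rtrim($line);
}
print_r($firstTwo);

unlink($path);
```

Code that only expects something it can use with foreach does not need to know which of the implementations it got.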

Map Elements

Using iterators you can encapsulate mapping actions, like using array_map() on an array, but only for the elements that are actually read. Let's say you need to chop all trailing whitespace from the lines.

Step One: Map Iterator:

class MyProjectMapIterator implements OuterIterator {



  private $_innerIterator = NULL;

  private $_callback = NULL;



  public function __construct(Iterator $innerIterator, $callback) {

    $this->_innerIterator = $innerIterator;

    $this->_callback = $callback;

  }



  public function map($current, $key) {

    return call_user_func($this->_callback, $current, $key);

  }



  public function getInnerIterator() {

    return $this->_innerIterator;

  }



  public function rewind() {

    $this->getInnerIterator()->rewind();

  }



  public function next() {

    $this->getInnerIterator()->next();

  }



  public function key() {

    return $this->getInnerIterator()->key();

  }



  public function current() {

    return $this->map(

      $this->getInnerIterator()->current(),

      $this->getInnerIterator()->key()

    );

  }



  public function valid() {

    return $this->getInnerIterator()->valid();

  }

}

OuterIterator is an interface for iterators that wrap other iterators. It is defined in the SPL and extends the Iterator interface. The MapIterator is an iterator of that kind, so it is cleaner to implement it that way.
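
The SPL also contains IteratorIterator, a ready-made OuterIterator that forwards all calls to the inner iterator. A shorter variant of the map iterator could extend it and override only current(). This is a sketch under that assumption. The class name MyProjectCallbackMapIterator is hypothetical, and the attribute line only silences a deprecation notice on PHP 8.1 and later (older versions treat it as a comment).

```php
<?php
// Hypothetical alternative to the map iterator, built on IteratorIterator.
class MyProjectCallbackMapIterator extends IteratorIterator {

  private $_callback = NULL;

  // IteratorIterator accepts any Traversable, including an IteratorAggregate.
  public function __construct(Traversable $traversable, $callback) {
    parent::__construct($traversable);
    $this->_callback = $callback;
  }

  #[\ReturnTypeWillChange]
  public function current() {
    return call_user_func($this->_callback, parent::current(), $this->key());
  }
}

$mapped = new MyProjectCallbackMapIterator(
  new ArrayObject(array("line one \n", "line two\t\n")),
  function($current, $key) {
    return ($key + 1).': '.rtrim($current);
  }
);
$result = iterator_to_array($mapped);
print_r($result);
```

The callback runs only for elements that are actually fetched, the same as in the longer implementation.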

Step Two: Using The Map Iterator:

Using the map iterator is not unlike using array_map.

$output = new MyProjectLineOutput(

  new MyProjectMapIterator(

    // the constructor requires an Iterator, MyProjectFile is only an IteratorAggregate
    new MyProjectFileUnbuffered('sample.txt'),

    function($current, $key) {

      return chop($current);

    }

  )

);

$output();

The file() function has a flag for stripping the line endings (FILE_IGNORE_NEW_LINES), but it is limited to exactly this task. With MapIterator you get a lot more flexibility. You could even extend the MapIterator to get reusable mappings.

Step Three: Extending the Map Iterator

class MyProjectMapIteratorUpper extends MyProjectMapIterator {



  public function __construct(Iterator $innerIterator) {

    parent::__construct(

      $innerIterator,

      function($current, $key) {

        return strToUpper($current);

      }

    );

  }

}

Filter Elements:

file() has another flag, FILE_SKIP_EMPTY_LINES, that skips empty lines. This is a filter task, and there is already a superclass for that in the SPL.

class MyProjectFilterIteratorSkipEmptyLines extends FilterIterator {



  public function accept() {

    return trim($this->getInnerIterator()->current()) !== '';

  }

}
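
A usage sketch for such a filter. The class is repeated under a shorter, hypothetical name so that the example runs on its own, the return type declaration assumes PHP 7 or later, and the ArrayIterator stands in for a file.

```php
<?php
// Same idea as the filter class above, repeated to keep the example self-contained.
class MyProjectSkipEmptyLines extends FilterIterator {

  public function accept(): bool {
    return trim($this->getInnerIterator()->current()) !== '';
  }
}

$filtered = new MyProjectSkipEmptyLines(
  new ArrayIterator(array("first\n", "\n", "  \n", "second\n"))
);

$kept = array();
foreach ($filtered as $key => $line) {
  $kept[$key] = rtrim($line);
}
print_r($kept);
```

The filter keeps the original keys (0 and 3 here), so the line numbers of the source are still available after filtering.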

Conclusion

Iterators are not about writing less code up front. But they help you to write source that is encapsulated, easy to test and reusable. Because of this, you will end up with less code over time.

2012-07-14

Matching Classes In XSLT

A while ago (well, 2 years) I posted an entry about the differences between CSS selectors and XPath expressions. A simple ".classname" in CSS is a noisy "contains(concat(' ', normalize-space(@class), ' '), ' classname ')" in XPath.

In XSLT this adds a lot of overhead. But if your XSLT processor supports EXSLT you are able to encapsulate it into a function.

I put this into a file "contains-token.xsl".



<?xml version="1.0" encoding="UTF-8"?>

<xsl:stylesheet

  version="1.0"

  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"

  xmlns:func="http://exslt.org/functions"

  extension-element-prefixes="func"

>

<func:function name="func:contains-token">

  <xsl:param name="haystack"/>

  <xsl:param name="needle"/>

  <xsl:variable name="normalizedHaystack" select="concat(' ', normalize-space($haystack), ' ')"/>

  <xsl:variable name="normalizedNeedle" select="concat(' ', normalize-space($needle), ' ')"/>

  <func:result select="$needle != '' and contains($normalizedHaystack, $normalizedNeedle)"/>

</func:function>



</xsl:stylesheet>

You need to declare the func namespace and define it as an extension prefix. The function definition is not unlike a normal template definition. You just have to use the "func:function" and "func:result" elements. Functions always need to be inside a namespace. If in doubt, just use the "func" namespace itself.

In this case I normalize both parameters first and then just check whether the haystack contains the needle.
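
The effect of the normalization can be checked outside of XSLT. The sketch below evaluates the plain XPath expression with PHP's DOMXPath (it assumes the DOM extension is available and does not use the EXSLT function itself) and compares it with a naive contains() on a made-up sample document.

```php
<?php
$document = new DOMDocument();
$document->loadXML(
  '<body>'.
  '<div class="first  classname">match</div>'.
  '<div class="classname-wide">no match</div>'.
  '<div class=" other ">no match</div>'.
  '</body>'
);
$xpath = new DOMXPath($document);

// Token match: pad haystack and needle with spaces before comparing.
$tokenExpression =
  "//div[contains(concat(' ', normalize-space(@class), ' '), ' classname ')]";
$matches = array();
foreach ($xpath->query($tokenExpression) as $node) {
  $matches[] = $node->textContent;
}

// Naive substring match: also catches "classname-wide".
$naiveCount = $xpath->query("//div[contains(@class, 'classname')]")->length;

print_r($matches);
echo $naiveCount, "\n";
```

Only the first div contains "classname" as a complete token. The naive version matches two elements.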

Now it is possible to import the function and use it.



...

<xsl:import href="../functions/contains-token.xsl"/>



<xsl:template match="/">

  ...

  <xsl:value-of select=".//div[func:contains-token(@class, 'classname')]"/>

  ...

</xsl:template>

...

It is still longer than the CSS version but it is a lot more readable.

2012-07-31

What Iterators Can Do For You

2012-07-30

Using The PHP 5.4 Webserver On Windows

A Shortcut

2012-07-14

Matching Classes In XSLT