Implementing the Strings Cursor

Learn how to create a string cursor with the help of a UTF-8 string iterator.

Starting our traverse method implementation

Our abstract method should look familiar. Its signature contains the three values we have been using so far to make decisions about our text input when iterating character-by-character. Our abstract cursor will call this method at each position of the string and use its return value to decide whether or not to continue. We will now begin work on our traverse method:

<?php
abstract class AbstractStringCursor
{
    // ...
    protected $initialCharacter;
    protected $position = 0;
    protected $index = 0;

    public function traverse()
    {
        $result = new CursorResult();

        // Byte size of the iterator's last character and its position in the input.
        $lastSize = $this->iterator->lastCharSize();
        $startPosition = $this->iterator->key();

        $this->position = $startPosition;
        $result->startPosition = $startPosition;

        // Remember the first character so implementations can compare against it.
        $this->initialCharacter = $this->iterator->current();
        // ...
        return $result;
    }
    // ...
}

The implementation of our traverse method begins in the code above. We have defined some additional internal properties that will help maintain state as our cursor traverses a string. As a convenience, we keep track of the initial character, which lets us easily compare subsequent characters to it, so that the cursor implementations that need this information do not each have to work it out themselves.
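To see why remembering the initial character is convenient, here is a minimal sketch of a cursor that stops at a closing quote. Everything in it is assumed for illustration and is not the lesson's actual code: the SketchCursor and QuotedStringSketch classes, the accept method with its (current, previous, next) signature, and the plain array of characters that stands in for the iterator.

```php
<?php
// Hypothetical sketch: these class and method names are not from the lesson.
abstract class SketchCursor
{
    protected $initialCharacter;

    // Assumed signature: the current, previous, and next character.
    // Returning false tells traverse() to stop.
    abstract protected function accept($current, $prev, $next);

    public function traverse(array $chars, $start)
    {
        // Cache the first character once, so subclasses can compare against it.
        $this->initialCharacter = $chars[$start];
        $end = $start;

        for ($i = $start + 1; $i < count($chars); $i++) {
            $prev = $chars[$i - 1];
            $next = $chars[$i + 1] ?? null;
            $end = $i;
            if (!$this->accept($chars[$i], $prev, $next)) {
                break;
            }
        }

        // Return everything from the start up to and including the stop character.
        return implode('', array_slice($chars, $start, $end - $start + 1));
    }
}

class QuotedStringSketch extends SketchCursor
{
    protected function accept($current, $prev, $next)
    {
        // Naive rule: stop at the first character that matches the opening
        // quote and is not directly preceded by a backslash.
        return !($current === $this->initialCharacter && $prev !== '\\');
    }
}

// Split into UTF-8 characters (not bytes) using the PCRE "u" modifier.
$chars = preg_split('//u', 'say "héllo" now', -1, PREG_SPLIT_NO_EMPTY);
$cursor = new QuotedStringSketch();
echo $cursor->traverse($chars, 4), "\n";
```

Because the base class stores the initial character, the subclass only has to express its stopping rule. The same subclass would work for single quotes without any changes.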

The $position and $index values will contain similar information but serve different purposes. The $position will keep track of the current character position within the larger string, and the $index will keep track of the cursor's position relative to where it started. For example, our output began at position 10 within the larger input, yet the $index would have been 0 at that point, because the $index counts the number of characters the cursor has seen.
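The difference between the two counters can be sketched with a short loop. The input string and the three-character limit below are made up for illustration, and the variables only mirror the names of the properties described above, not the lesson's implementation.

```php
<?php
// Illustrative input only; split into UTF-8 characters rather than bytes.
$chars = preg_split('//u', 'prefix: ☃ "quoted" tail', -1, PREG_SPLIT_NO_EMPTY);

$startPosition = 10;        // where this hypothetical cursor begins
$position = $startPosition; // absolute character position in the input
$index = 0;                 // characters seen by the cursor so far
$seen = [];

// Look at the first three characters the cursor would visit.
while ($position < count($chars) && $index < 3) {
    $seen[] = sprintf('%s position=%d index=%d', $chars[$position], $position, $index);
    $position++;
    $index++;
}

echo implode("\n", $seen), "\n";
```

The first line printed reports position=10 and index=0. Throughout the loop, $position minus $index always equals the starting position.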

The first and last statements of traverse create an instance of CursorResult and return it, respectively. Our first bit of interesting code is the pair of assignments to $lastSize and $startPosition.

We are storing the number of bytes that appeared in the iterator’s last character and its position within the input text. We will use these later when we ...