Working with Embedded Languages

Learn how to work with embedded languages using an HTML fragment parser.

We will build an HTML fragment and position analyzer. Our parser will not be concerned with creating the associations between HTML elements, nor will it extract the content or sections of the document between the elements. Instead, we will focus entirely on where HTML elements appear and what features they have, such as parameters or attributes. The intended use case for this parser is to help make decisions based on the results of other parsers.

For example, consider a Blade template that dynamically constructs an HTML tag pair from a variable by writing {{ $element }} where the tag name would normally appear.

It is relatively simple for us to read through and understand the intention behind the dynamic code. However, this task is much more difficult for HTML parsers or analysis systems. For instance, if we were to ignore the invalid characters at that location within an HTML document, would the tag name become {{ or {{ $element }}? Questions like this make working with HTML documents containing embedded languages particularly difficult, especially when we need to preserve or account for the embedded languages.
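To make the ambiguity concrete, the following minimal sketch applies two plausible readings of a tag name to a one-line template. The template and both regular expressions are assumptions made purely for illustration; they are not the parser we will build.

```php
<?php

// Hypothetical template: $element is assumed to hold a tag name such as "div".
$template = '<{{ $element }} class="title">Hello</{{ $element }}>';

// Reading 1: a tag name is everything after "<" up to whitespace, ">" or "/".
preg_match('/<([^\s>\/]+)/', $template, $naive);

// Reading 2: a Blade echo is treated as a single unit inside the tag name.
preg_match('/<(\{\{.*?\}\}|[^\s>\/]+)/', $template, $aware);

echo $naive[1], PHP_EOL; // {{
echo $aware[1], PHP_EOL; // {{ $element }}
```

Neither answer is wrong on its own terms, which is exactly why an HTML-only view of the document is not enough here.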

The system we will work through will allow us to solve these problems reasonably quickly and provide a foundation for new features. It will accept an input document containing HTML with any number of embedded languages and produce a list of parsed HTML nodes that we can ask questions about to help with whatever our task is. Our HTML parser will also accept ranges within the document that it should ignore.
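As a rough illustration of what such ranges could look like, the sketch below uses a regular expression as a stand-in for another parser that already understands the embedded language. The function name findEchoRanges, the ['start', 'end'] array shape, and the sample template are all hypothetical. Offsets are byte offsets and the end is exclusive.

```php
<?php

/**
 * Hypothetical stand-in for a dedicated Blade parser. It only knows about
 * {{ ... }} echoes; real Blade has more constructs than this.
 *
 * @return array<int, array{start: int, end: int}>
 */
function findEchoRanges(string $document): array
{
    // PREG_OFFSET_CAPTURE makes each match a [text, byteOffset] pair.
    preg_match_all('/\{\{.*?\}\}/s', $document, $matches, PREG_OFFSET_CAPTURE);

    $ranges = [];
    foreach ($matches[0] as [$text, $offset]) {
        $ranges[] = ['start' => $offset, 'end' => $offset + strlen($text)];
    }

    return $ranges;
}

$document = '<{{ $element }} class="title">Hello</{{ $element }}>';
$ranges = findEchoRanges($document);

foreach ($ranges as $range) {
    // Print each range next to the source text it covers.
    $length = $range['end'] - $range['start'];
    echo $range['start'], '-', $range['end'], ': ',
        substr($document, $range['start'], $length), PHP_EOL;
}
```

Keeping the ranges as plain offsets means the HTML parser never needs to know which language produced them.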

When our specialized HTML parser encounters one of these regions, it will skip over it and continue with the rest of the source document. This is similar to our previous technique of filling a document with spaces and newlines, with the vital difference that we do not remove information from the input document. These ranges will come from other parsers, such as a dedicated Blade parser, and allow our HTML parser to concern itself only with determining what is most likely to produce HTML once the embedded languages have been evaluated.
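The skipping behavior described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the parser itself: the function name scanFragments, the shape of the returned arrays, and the sample template are hypothetical, and the ignored regions are assumed to be sorted, non-overlapping byte ranges with an exclusive end.

```php
<?php

/**
 * Minimal sketch: find "<...>" fragments while jumping over ignored
 * regions. Each region is ['start' => int, 'end' => int] in byte offsets,
 * end exclusive; regions are assumed sorted and non-overlapping.
 */
function scanFragments(string $document, array $ignore): array
{
    $fragments = [];
    $open = null; // offset of the "<" that began the current fragment
    $next = 0;    // index of the next region we may run into

    for ($i = 0, $length = strlen($document); $i < $length; $i++) {
        if ($next < count($ignore) && $i === $ignore[$next]['start']) {
            // Jump to the last ignored byte; the loop's $i++ steps past it.
            $i = $ignore[$next]['end'] - 1;
            $next++;
            continue;
        }

        if ($document[$i] === '<' && $open === null) {
            $open = $i;
        } elseif ($document[$i] === '>' && $open !== null) {
            $fragments[] = [
                'start' => $open,
                'end' => $i + 1,
                'source' => substr($document, $open, $i + 1 - $open),
            ];
            $open = null;
        }
    }

    return $fragments;
}

$document = <<<'BLADE'
<div class="{{ $on > 0 ? 'a' : 'b' }}">Hi</div>
BLADE;

// Stand-in for a Blade parser's output: the single {{ ... }} region.
$ignore = [[
    'start' => strpos($document, '{{'),
    'end' => strpos($document, '}}') + 2,
]];

$aware = array_column(scanFragments($document, $ignore), 'source');
$naive = array_column(scanFragments($document, []), 'source');

print_r($aware); // the full opening tag, then "</div>"
print_r($naive); // the opening tag is cut short at the ">" inside the echo
```

Because the document is never rewritten, every reported offset still points into the original source, and the text of each fragment keeps the embedded code it contained.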

Determining the context in which a piece of embedded ...
