Handling HTML Markup with Drupal's Migrate API

In Drupal 8, we use the core Migrate API for:

  • Upgrading Drupal 6 and Drupal 7 sites
  • Migrating sites from other systems to Drupal
  • Recurring imports from external systems (feeds)

It is a robust, flexible tool.

Drupal works best with structured data, and the Migrate API supports this: file attachments, related taxonomy terms, references to authors or other nodes, and so on. Along with the structured data, we also have to deal with blocks of text, and these blocks often contain HTML markup.

Until now, the Migrate API has supported only basic processing of text fields, using regular expressions. The speakers have contributed plugins to the Migrate Plus module that parse the HTML properly instead. This approach is easier to use and more reliable than regular expressions.
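To make the reliability claim concrete, here is a minimal sketch of the difference between the two approaches. It is written in Python with only the standard library and is not the Migrate Plus code (those plugins are written in PHP). The domain names, function names, and sample markup are all hypothetical, and the XML parser used here accepts only well-formed markup, so real-world HTML would need a more tolerant parser.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

OLD_BASE = "http://old.example.com"   # hypothetical legacy domain
NEW_BASE = "https://new.example.com"  # hypothetical new domain

# Hypothetical body text: one real link (single-quoted attribute) and one
# piece of visible text that merely looks like an href attribute.
SAMPLE = (
    "<p><a href='http://old.example.com/about'>About</a> was linked as "
    '<code>href="http://old.example.com/about"</code>.</p>'
)


def rewrite_links_regex(markup: str) -> str:
    """Naive approach: match href="..." anywhere in the raw string."""
    return re.sub(r'href="' + re.escape(OLD_BASE), 'href="' + NEW_BASE, markup)


def rewrite_links_dom(markup: str) -> str:
    """Parse the markup, then change only the href attribute of <a> elements."""
    # Wrap the fragment so that it has a single root element.
    root = ET.fromstring("<div>" + markup + "</div>")
    for link in root.iter("a"):
        href = link.get("href", "")
        if href.startswith(OLD_BASE):
            link.set("href", NEW_BASE + href[len(OLD_BASE):])
    # Serialize the wrapper's contents, leaving the wrapper itself out.
    parts = [escape(root.text or "")]
    parts.extend(ET.tostring(child, encoding="unicode") for child in root)
    return "".join(parts)


if __name__ == "__main__":
    print(rewrite_links_regex(SAMPLE))
    print(rewrite_links_dom(SAMPLE))
```

In this sketch the regular expression misses the real link, because its attribute is single-quoted, and rewrites the visible text inside the code element instead. The parsed version changes the link and leaves the text alone, because it works on elements and attributes rather than on raw characters.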

We originally wrote these plugins while working for Isovera on a project for Pegasystems. Both Isovera and Pega have supported sharing these plugins with the Drupal community. We hope other developers will use them and contribute plugins of their own that take the same approach.

In this session, you will:

  • Get an introduction to the new DOM-based plugins in Migrate Plus
  • Learn how to use the new plugins in your own migrations (demo time!)
  • See how to extend the framework with your own custom plugins

10:00 - 10:45am
Gaige Classroom 203

benjifisher
Senior Developer

I am one of the maintainers of the migration subsystem (Migrate API) in Drupal core.

I moderate the weekly Drupal Usability meeting.

I choose to work with Drupal, and other open-source software, because I hate the idea of duplicated effort. When I solve a problem, I want to share my solution so that no one else has to struggle with it.