Drupal filter system updating

Jul 03, 2004

For a long time people have been complaining about the filter system in Drupal. This is the part that transforms user-supplied input into HTML output, and it handles tasks such as HTML tag stripping, code tags and automatic linking.

Like most parts of Drupal, it's very modular and pluggable. Still, it doesn't do what most people want, and it lacks some features that are present in most other CMSs. To address these issues I've been thinking about a major filter system upgrade for a couple of months, but I haven't had time to actually do it until now.

The root of the problem is that Drupal only has one global filtering profile: the same settings and rules are applied to all input, regardless of who posted it or where it was posted. Administrators cannot have looser filters than anonymous visitors. In some cases (blocks, book and site pages), some customizability is available through a module-specific selector for text, HTML or PHP code, which is then only available to administrators, but this is independent of the filter system.

My solution basically consists of multiple filter profiles.

Instead of one global profile, administrators will be free to define as many profiles as they want. Each profile contains its own filter configuration: which filters are enabled, in what order and with what settings. Access to filter profiles is configurable with roles.
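
To make the idea concrete, here is a minimal sketch of what a profile could look like as a data structure. It is written in Python rather than Drupal's PHP, purely as an illustration; the names FilterProfile and profiles_for, and the two example filters, are hypothetical and not part of any existing Drupal API.

```python
import html
from dataclasses import dataclass, field
from typing import Callable, List, Set

# Assumption: a filter is modeled as a plain text-to-text function.
Filter = Callable[[str], str]


@dataclass
class FilterProfile:
    name: str
    filters: List[Filter]                         # applied in list order
    roles: Set[str] = field(default_factory=set)  # roles allowed to use the profile

    def apply(self, text: str) -> str:
        """Run every enabled filter over the text, in the configured order."""
        for f in self.filters:
            text = f(text)
        return text


def profiles_for(user_roles: Set[str], profiles: List[FilterProfile]) -> List[FilterProfile]:
    """Return the profiles that at least one of the user's roles may use."""
    return [p for p in profiles if p.roles & user_roles]


def escape_html(text: str) -> str:
    return html.escape(text)


def line_breaks(text: str) -> str:
    return text.replace("\n", "<br />\n")


PROFILES = [
    FilterProfile("Plain-text", [escape_html, line_breaks], {"anonymous", "admin"}),
    FilterProfile("Raw HTML", [], {"admin"}),
]
```

With this set-up an anonymous visitor would only be offered "Plain-text", while an admin would see both profiles.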

In addition to this, some small extra filters will be created out of current pieces of code. For example, one for evaluating PHP code. Instead of block, book and page.module each having a PHP type, the admin can simply set up a PHP filtering profile, restricted to admins, and enter content with that type in the blocks and pages to be run as PHP code.

For anonymous users, only one profile is likely to be available, and in that case nothing changes for them. Only when multiple profiles are enabled do you get a selector (dropdown or radio) below a textarea to choose the format.
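
The rule for when to show a selector is simple enough to sketch. This is an illustrative Python fragment, not Drupal code; format_selector and the shape of the returned dictionary are assumptions made for the example.

```python
from typing import List, Optional


def format_selector(available: List[str], widget: str = "radios") -> Optional[dict]:
    """Describe the selector to render below a textarea, or None if there is no choice."""
    if len(available) <= 1:
        # A single profile: the form looks exactly as it does today.
        return None
    return {"widget": widget, "options": list(available), "default": available[0]}
```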

Now, the idea sounds nice, but how do we implement it?

1) Filters need to be made profile-aware.
Since the filter-ordering changes in 4.4, filters have already grown from simple hooks into registered components. We simply expand the filter hook and require modules to store their information per profile. This is not a problem because most configuration is done with Drupal variables anyway: simple prefixing will work. For complex filters which have extra settings pages, the module itself can decide whether to have global or per-profile settings. For example, smileys.module will probably not require separate sets of smileys per profile: you either have smileys enabled or you don't.
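
As a rough sketch of the prefixing idea (in Python, for illustration only): the VARIABLES dictionary stands in for Drupal's variable store, and the "profile_<id>_" naming scheme and the fallback to a global value are assumptions of this example, not a decided design.

```python
# Stand-in for Drupal's persistent variable store (an assumption for this sketch).
VARIABLES = {}


def profile_variable_name(name: str, profile_id: int) -> str:
    """Prefix a setting name with the profile it belongs to."""
    return f"profile_{profile_id}_{name}"


def variable_get_for_profile(name: str, profile_id: int, default=None):
    """Look up the per-profile value first, then the global one, then the default."""
    key = profile_variable_name(name, profile_id)
    if key in VARIABLES:
        return VARIABLES[key]
    return VARIABLES.get(name, default)
```

A module that only wants global settings, like the smileys example, would simply never write the prefixed variables.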

2) How to store type information
The second and biggest problem is where and how to store the information about which profile a particular piece of content uses. Either we provide a function to output a profile form selector and put the responsibility for using it in modules, or we simply include the selector with form_textarea and pick a standard format for handling metadata about pieces of text (a fieldname_meta column for fieldname? change text fields into arrays with 'text' and 'type' members?). I prefer the form_textarea method because it fits in with how we now handle tips about filtering below textareas.
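
The two storage options mentioned in parentheses can be compared side by side. This is only a Python sketch of the data shapes; the helper names and the "filtered-html" default are hypothetical.

```python
DEFAULT_FORMAT = "filtered-html"  # hypothetical id of the default profile


def read_with_meta_column(row: dict, fieldname: str) -> tuple:
    """Option A: the profile lives in a companion '<fieldname>_meta' column."""
    return row[fieldname], row.get(f"{fieldname}_meta", DEFAULT_FORMAT)


def to_text_array(text: str, fmt: str = DEFAULT_FORMAT) -> dict:
    """Option B: the field itself becomes an array with 'text' and 'type' members."""
    return {"text": text, "type": fmt}


# Example row using option A: the body is marked as PHP content.
row = {"body": "<?php print 1 + 1; ?>", "body_meta": "php"}
```

Option A leaves existing columns untouched and lets old rows fall back to a default, while option B keeps the text and its type together wherever the value travels.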

3) Updating modules that display content
When a module has to display a piece of user-supplied text and passes it to check_output, it would also have to pass the profile used. This is all that is needed, so it keeps the hassle minimal. Checking which profile can be used and who used it is done on submission, not on viewing, so no permissions checks have to be done when filtering takes place.
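
The split between submission-time checks and view-time filtering might look roughly like this. It is a Python illustration, not the real check_output: the signature, the PROFILES table and validate_submission are assumptions made for the sketch.

```python
import html

# Hypothetical profile table: filters to run and roles allowed to pick the profile.
PROFILES = {
    "plain-text": {"filters": [html.escape], "roles": {"anonymous", "admin"}},
    "raw-html": {"filters": [], "roles": {"admin"}},
}


def validate_submission(profile_id: str, user_roles) -> str:
    """Submission time: refuse a profile the submitting user may not use."""
    profile = PROFILES.get(profile_id)
    if profile is None or not (profile["roles"] & set(user_roles)):
        raise PermissionError(f"profile {profile_id!r} is not available to this user")
    return profile_id


def check_output(text: str, profile_id: str) -> str:
    """View time: apply the stored profile's filters, with no permission checks."""
    for f in PROFILES[profile_id]["filters"]:
        text = f(text)
    return text
```

A module would call validate_submission once when the form is saved and store the returned profile id next to the text; every later view only needs check_output.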

4) Include a modified, profile-aware filtercache
The additional complexity would reduce performance a bit, but the increase in power would be huge. On top of that, many people agree with me that filtercache offers significant speedups for a site that uses any sort of non-trivial filtering, so I will push for inclusion of a (modified) filtercache along with the major changes.

5) Handle editboxes
A final problem is what to do with regular editboxes. Right now there is no consensus on whether or not to filter them. Some people want to use HTML in them, others don't. Including a type selector for every editbox is unnecessary and nearly impossible to do from a UI point of view, so I would instead just let the admin choose one profile to be used for non-textarea content, which would default to 'plain-text'.
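
In other words, the profile for a field would be resolved along these lines. This is a Python sketch with hypothetical setting names; only the 'plain-text' default for editboxes comes from the proposal above.

```python
# Hypothetical site-wide settings chosen by the admin.
SITE_SETTINGS = {
    "textarea_default": "filtered-html",
    "editbox_profile": "plain-text",
}


def profile_for_field(field_type: str, submitted_profile=None) -> str:
    """Textareas use the profile picked in the form; everything else uses one fixed profile."""
    if field_type == "textarea":
        return submitted_profile or SITE_SETTINGS["textarea_default"]
    return SITE_SETTINGS["editbox_profile"]
```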

A typical set-up of profiles would be:
- Filtered HTML: default type for regular visitors, HTML is limited to a set of allowed tags, CSS can be stripped.
- Plain-text: default type for editboxes
- Raw HTML: only for admins, performs no filtering on the HTML
- PHP: only for admins, executes the PHP code and outputs the result

Filters like Textile can either be used as the default profile or as an extra profile for those who want to give their visitors a choice. People who do not need any filtering complexity simply use the same setup as before and nothing changes for visitors. The only difference is that they, as admins, still have more control and options for filtering.

Context Aware Filters

Jul 04, 2004 Anonymous

Just to throw a spanner in the works, what about filters that actually know what they are filtering and can filter differently depending on the vocabulary and/or term that the node belongs to?

This'd be extremely useful for something such as the glossary module. It'd then be able to provide a different meaning for terms based on their context.

Just my two cents

Blurry lines

Jul 04, 2004 Steven

There was some discussion a while ago by Goba on this, but his main usage was IMO not filter-related: he wanted to insert special things into the body, such as a floating block of related articles in a series. This stuff does IMO not belong to filters and should instead be done in the theme, a block or with nodeapi. Still there could indeed be valid uses of this.

The proposed solution was to add a context parameter to the filter hook which would contain various 'context' things such as the node object, the comment object, or anything else: remember, the filters are used for more than just nodes. This is IMO important to keep in mind, and something we should stick to. The problem is that such contexts could and would contain anything people might need access to.

We could still do this, but filtercaching would get trickier. Until I see some practical implementations of useful filters which depend on contextual info, I'm going to leave it out in any case. It can still be added later.

Nice

Jul 16, 2004 Anonymous

Thanks for all the time and thought you've put behind the filter system.

An idea I'd like to throw out similar to your concept of filter profiles is a standardized parsing library for macrotags, such as [img|nid].

That has always been my goal with the macrotag module and I have been able to successfully recreate every bracketed-style macrotag in Drupal using this system. (I've even tied the execution of macrotags to user roles).

That being said, the module needs to be updated, which I plan to do, but I wanted to toss out the idea of how cool it would be to have an advanced filtering system coupled with an easy way to make simple filters.
