The manual doesn't really make it clear that the PECL extension is only for PHP 4. For PHP 5, you have to use the --with-tidy configure option. (At least, this was the case for me with PHP 5.2.5 on Mac OS X 10.4 Tiger.) Tiger ships with the tidy libs already installed, with headers under /usr/include, but one header file is broken. The easiest way to get PHP to compile with Tidy on 10.4 (and on 10.5 Leopard) is to download MacPorts and use it to install tidy (unless you want to build tidy from source). After installing MacPorts via the .dmg, just su to root and run:
port install tidy
(port is installed in /opt/local/bin)
Then configure PHP with --with-tidy=/opt/local (MacPorts installs everything under /opt/local).
You can also use MacPorts to install all sorts of other libraries such as libpng, libmcrypt, freetype and libjpeg (although libpng and libjpeg are also available as package installs).
The specific compile error is:
In file included from /usr/include/tidy/tidy.h:70,
from /Users/mari/Downloads/php-5.2.5/ext/tidy/tidy.c:34:
/usr/include/tidy/platform.h:515: error: duplicate 'unsigned'
/usr/include/tidy/platform.h:515: warning: useless type name in empty declaration
CLVIII. Tidy
Introduction
Tidy is a binding for the Tidy HTML library, used to clean up and manipulate HTML documents and to process them as a hierarchy of tags.
Requirements
To use Tidy, you need the libtidy library, which can be downloaded from http://tidy.sourceforge.net/.
Installation
Tidy is currently available for PHP 4.3.x and PHP 5 as a PECL extension. The extension is available at http://pecl.php.net/package/tidy.
Note: Tidy 1.0 works only with PHP 4.3.x, whereas Tidy 2.0 works only with PHP 5.
If PEAR is available on your *nix system, you can use the PEAR installer to get the tidy extension with the following command: pear install tidy.
You can also download the tar.gz archive and install tidy by hand:
Windows users can download the extension DLL from http://pecl4win.php.net/ext.php/php_tidy.dll.
In PHP 5, you only need to compile using the --with-tidy option.
Runtime Configuration
The behaviour of these functions is affected by settings in php.ini.
Table 1. Configuration options
| Name | Default | Changeable | Changelog |
|---|---|---|---|
| tidy.default_config | "" | PHP_INI_SYSTEM | Available since PHP 5.0.0. |
| tidy.clean_output | "0" | PHP_INI_PERDIR | Available since PHP 5.0.0. |
Here is a short explanation of the configuration directives.
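As a rough illustration of the table above, the two directives can be inspected at runtime with ini_get(). This is only a sketch: report_tidy_ini() is a hypothetical helper name, not part of the extension, and ini_get() returns false for both directives when the tidy extension is not loaded.

```php
<?php
// Hypothetical helper: report the two tidy directives from Table 1.
// ini_get() returns false for a directive that is not registered,
// which is the case when the tidy extension is not loaded.
function report_tidy_ini()
{
    $report = array();
    foreach (array('tidy.default_config', 'tidy.clean_output') as $name) {
        $value = ini_get($name);
        $report[$name] = ($value === false) ? '(directive not available)' : $value;
    }
    return $report;
}

foreach (report_tidy_ini() as $name => $value) {
    echo $name, ' = ', var_export($value, true), "\n";
}
```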
Resource Types
This extension defines no resources.
Predefined Classes
tidyNode
Methods
tidyNode->hasChildren - Returns TRUE if the current node has children
tidyNode->hasSiblings - Returns TRUE if the current node has siblings
tidyNode->isAsp - Returns TRUE if the current node is ASP
tidyNode->isComment - Returns TRUE if the current node is a comment
tidyNode->isHtml - Returns TRUE if the current node is HTML
tidyNode->isJste - Returns TRUE if the current node is JSTE
tidyNode->isPhp - Returns TRUE if the current node is PHP
tidyNode->isText - Returns TRUE if the current node is text (no markup)
Properties
value - the value of the node (e.g. the HTML text)
name - the name of the tag (e.g. html, a, etc.)
type - the type of the node (one of the predefined constants, e.g. TIDY_NODETYPE_PHP)
line* - the line where the node starts
column* - the column where the node starts
proprietary* - TRUE if the node refers to a proprietary tag
id - the ID of the tag (one of the predefined constants, e.g. TIDY_TAG_FRAME)
attribute - an array of the current node's attributes, or NULL if there are none
child - an array of the child tidyNode objects, or NULL if there are none
Note: Properties marked with * are only available since PHP 5.1.0.
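The methods and properties above can be combined to walk a parsed document. The following is a minimal sketch under stated assumptions: dump_node_names() is a hypothetical helper, the sample markup is made up, and nodes without a tag name (such as the root or text nodes) are simply skipped.

```php
<?php
// Hypothetical helper: return the tag names found at or below $node,
// one per line, indented by two spaces per nesting level.
function dump_node_names($node, $depth = 0)
{
    $lines = array();
    // Nodes such as the root or plain text carry no tag name; skip them.
    if ($node->name !== null && $node->name !== '') {
        $lines[] = str_repeat('  ', $depth) . $node->name;
    }
    if ($node->hasChildren()) {
        foreach ($node->child as $child) {
            $lines = array_merge($lines, dump_node_names($child, $depth + 1));
        }
    }
    return $lines;
}

if (extension_loaded('tidy')) {
    $tidy = tidy_parse_string('<html><head><title>t</title></head><body><p>Hello</p></body></html>');
    echo implode("\n", dump_node_names(tidy_get_root($tidy))), "\n";
} else {
    echo "The tidy extension is not loaded.\n";
}
```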
Predefined Constants
These constants are defined by this extension, and are only available when the extension has been compiled into PHP or dynamically loaded at runtime.
Each TIDY_TAG_XXX represents an HTML tag. For example, TIDY_TAG_A represents the tag <a href="XX">link</a>. Each TIDY_ATTR_XXX represents an HTML attribute. For example, TIDY_ATTR_HREF represents the href attribute in the previous example.
The following constants are defined by this extension:
Table 2. tidy tag constants
| constant |
|---|
| TIDY_TAG_UNKNOWN |
| TIDY_TAG_A |
| TIDY_TAG_ABBR |
| TIDY_TAG_ACRONYM |
| TIDY_TAG_ALIGN |
| TIDY_TAG_APPLET |
| TIDY_TAG_AREA |
| TIDY_TAG_B |
| TIDY_TAG_BASE |
| TIDY_TAG_BASEFONT |
| TIDY_TAG_BDO |
| TIDY_TAG_BGSOUND |
| TIDY_TAG_BIG |
| TIDY_TAG_BLINK |
| TIDY_TAG_BLOCKQUOTE |
| TIDY_TAG_BODY |
| TIDY_TAG_BR |
| TIDY_TAG_BUTTON |
| TIDY_TAG_CAPTION |
| TIDY_TAG_CENTER |
| TIDY_TAG_CITE |
| TIDY_TAG_CODE |
| TIDY_TAG_COL |
| TIDY_TAG_COLGROUP |
| TIDY_TAG_COMMENT |
| TIDY_TAG_DD |
| TIDY_TAG_DEL |
| TIDY_TAG_DFN |
| TIDY_TAG_DIR |
| TIDY_TAG_DIV |
| TIDY_TAG_DL |
| TIDY_TAG_DT |
| TIDY_TAG_EM |
| TIDY_TAG_EMBED |
| TIDY_TAG_FIELDSET |
| TIDY_TAG_FONT |
| TIDY_TAG_FORM |
| TIDY_TAG_FRAME |
| TIDY_TAG_FRAMESET |
| TIDY_TAG_H1 |
| TIDY_TAG_H2 |
| TIDY_TAG_H3 |
| TIDY_TAG_H4 |
| TIDY_TAG_H5 |
| TIDY_TAG_H6 |
| TIDY_TAG_HEAD |
| TIDY_TAG_HR |
| TIDY_TAG_HTML |
| TIDY_TAG_I |
| TIDY_TAG_IFRAME |
| TIDY_TAG_ILAYER |
| TIDY_TAG_IMG |
| TIDY_TAG_INPUT |
| TIDY_TAG_INS |
| TIDY_TAG_ISINDEX |
| TIDY_TAG_KBD |
| TIDY_TAG_KEYGEN |
| TIDY_TAG_LABEL |
| TIDY_TAG_LAYER |
| TIDY_TAG_LEGEND |
| TIDY_TAG_LI |
| TIDY_TAG_LINK |
| TIDY_TAG_LISTING |
| TIDY_TAG_MAP |
| TIDY_TAG_MARQUEE |
| TIDY_TAG_MENU |
| TIDY_TAG_META |
| TIDY_TAG_MULTICOL |
| TIDY_TAG_NOBR |
| TIDY_TAG_NOEMBED |
| TIDY_TAG_NOFRAMES |
| TIDY_TAG_NOLAYER |
| TIDY_TAG_NOSAVE |
| TIDY_TAG_NOSCRIPT |
| TIDY_TAG_OBJECT |
| TIDY_TAG_OL |
| TIDY_TAG_OPTGROUP |
| TIDY_TAG_OPTION |
| TIDY_TAG_P |
| TIDY_TAG_PARAM |
| TIDY_TAG_PLAINTEXT |
| TIDY_TAG_PRE |
| TIDY_TAG_Q |
| TIDY_TAG_RP |
| TIDY_TAG_RT |
| TIDY_TAG_RTC |
| TIDY_TAG_RUBY |
| TIDY_TAG_S |
| TIDY_TAG_SAMP |
| TIDY_TAG_SCRIPT |
| TIDY_TAG_SELECT |
| TIDY_TAG_SERVER |
| TIDY_TAG_SERVLET |
| TIDY_TAG_SMALL |
| TIDY_TAG_SPACER |
| TIDY_TAG_SPAN |
| TIDY_TAG_STRIKE |
| TIDY_TAG_STRONG |
| TIDY_TAG_STYLE |
| TIDY_TAG_SUB |
| TIDY_TAG_TABLE |
| TIDY_TAG_TBODY |
| TIDY_TAG_TD |
| TIDY_TAG_TEXTAREA |
| TIDY_TAG_TFOOT |
| TIDY_TAG_TH |
| TIDY_TAG_THEAD |
| TIDY_TAG_TITLE |
| TIDY_TAG_TR |
| TIDY_TAG_TT |
| TIDY_TAG_U |
| TIDY_TAG_UL |
| TIDY_TAG_VAR |
| TIDY_TAG_WBR |
| TIDY_TAG_XMP |
Table 3. tidy attribute constants
| constant |
|---|
| TIDY_ATTR_UNKNOWN |
| TIDY_ATTR_ABBR |
| TIDY_ATTR_ACCEPT |
| TIDY_ATTR_ACCEPT_CHARSET |
| TIDY_ATTR_ACCESSKEY |
| TIDY_ATTR_ACTION |
| TIDY_ATTR_ADD_DATE |
| TIDY_ATTR_ALIGN |
| TIDY_ATTR_ALINK |
| TIDY_ATTR_ALT |
| TIDY_ATTR_ARCHIVE |
| TIDY_ATTR_AXIS |
| TIDY_ATTR_BACKGROUND |
| TIDY_ATTR_BGCOLOR |
| TIDY_ATTR_BGPROPERTIES |
| TIDY_ATTR_BORDER |
| TIDY_ATTR_BORDERCOLOR |
| TIDY_ATTR_BOTTOMMARGIN |
| TIDY_ATTR_CELLPADDING |
| TIDY_ATTR_CELLSPACING |
| TIDY_ATTR_CHAR |
| TIDY_ATTR_CHAROFF |
| TIDY_ATTR_CHARSET |
| TIDY_ATTR_CHECKED |
| TIDY_ATTR_CITE |
| TIDY_ATTR_CLASS |
| TIDY_ATTR_CLASSID |
| TIDY_ATTR_CLEAR |
| TIDY_ATTR_CODE |
| TIDY_ATTR_CODEBASE |
| TIDY_ATTR_CODETYPE |
| TIDY_ATTR_COLOR |
| TIDY_ATTR_COLS |
| TIDY_ATTR_COLSPAN |
| TIDY_ATTR_COMPACT |
| TIDY_ATTR_CONTENT |
| TIDY_ATTR_COORDS |
| TIDY_ATTR_DATA |
| TIDY_ATTR_DATAFLD |
| TIDY_ATTR_DATAPAGESIZE |
| TIDY_ATTR_DATASRC |
| TIDY_ATTR_DATETIME |
| TIDY_ATTR_DECLARE |
| TIDY_ATTR_DEFER |
| TIDY_ATTR_DIR |
| TIDY_ATTR_DISABLED |
| TIDY_ATTR_ENCODING |
| TIDY_ATTR_ENCTYPE |
| TIDY_ATTR_FACE |
| TIDY_ATTR_FOR |
| TIDY_ATTR_FRAME |
| TIDY_ATTR_FRAMEBORDER |
| TIDY_ATTR_FRAMESPACING |
| TIDY_ATTR_GRIDX |
| TIDY_ATTR_GRIDY |
| TIDY_ATTR_HEADERS |
| TIDY_ATTR_HEIGHT |
| TIDY_ATTR_HREF |
| TIDY_ATTR_HREFLANG |
| TIDY_ATTR_HSPACE |
| TIDY_ATTR_HTTP_EQUIV |
| TIDY_ATTR_ID |
| TIDY_ATTR_ISMAP |
| TIDY_ATTR_LABEL |
| TIDY_ATTR_LANG |
| TIDY_ATTR_LANGUAGE |
| TIDY_ATTR_LAST_MODIFIED |
| TIDY_ATTR_LAST_VISIT |
| TIDY_ATTR_LEFTMARGIN |
| TIDY_ATTR_LINK |
| TIDY_ATTR_LONGDESC |
| TIDY_ATTR_LOWSRC |
| TIDY_ATTR_MARGINHEIGHT |
| TIDY_ATTR_MARGINWIDTH |
| TIDY_ATTR_MAXLENGTH |
| TIDY_ATTR_MEDIA |
| TIDY_ATTR_METHOD |
| TIDY_ATTR_MULTIPLE |
| TIDY_ATTR_NAME |
| TIDY_ATTR_NOHREF |
| TIDY_ATTR_NORESIZE |
| TIDY_ATTR_NOSHADE |
| TIDY_ATTR_NOWRAP |
| TIDY_ATTR_OBJECT |
| TIDY_ATTR_OnAFTERUPDATE |
| TIDY_ATTR_OnBEFOREUNLOAD |
| TIDY_ATTR_OnBEFOREUPDATE |
| TIDY_ATTR_OnBLUR |
| TIDY_ATTR_OnCHANGE |
| TIDY_ATTR_OnCLICK |
| TIDY_ATTR_OnDATAAVAILABLE |
| TIDY_ATTR_OnDATASETCHANGED |
| TIDY_ATTR_OnDATASETCOMPLETE |
| TIDY_ATTR_OnDBLCLICK |
| TIDY_ATTR_OnERRORUPDATE |
| TIDY_ATTR_OnFOCUS |
| TIDY_ATTR_OnKEYDOWN |
| TIDY_ATTR_OnKEYPRESS |
| TIDY_ATTR_OnKEYUP |
| TIDY_ATTR_OnLOAD |
| TIDY_ATTR_OnMOUSEDOWN |
| TIDY_ATTR_OnMOUSEMOVE |
| TIDY_ATTR_OnMOUSEOUT |
| TIDY_ATTR_OnMOUSEOVER |
| TIDY_ATTR_OnMOUSEUP |
| TIDY_ATTR_OnRESET |
| TIDY_ATTR_OnROWENTER |
| TIDY_ATTR_OnROWEXIT |
| TIDY_ATTR_OnSELECT |
| TIDY_ATTR_OnSUBMIT |
| TIDY_ATTR_OnUNLOAD |
| TIDY_ATTR_PROFILE |
| TIDY_ATTR_PROMPT |
| TIDY_ATTR_RBSPAN |
| TIDY_ATTR_READONLY |
| TIDY_ATTR_REL |
| TIDY_ATTR_REV |
| TIDY_ATTR_RIGHTMARGIN |
| TIDY_ATTR_ROWS |
| TIDY_ATTR_ROWSPAN |
| TIDY_ATTR_RULES |
| TIDY_ATTR_SCHEME |
| TIDY_ATTR_SCOPE |
| TIDY_ATTR_SCROLLING |
| TIDY_ATTR_SELECTED |
| TIDY_ATTR_SHAPE |
| TIDY_ATTR_SHOWGRID |
| TIDY_ATTR_SHOWGRIDX |
| TIDY_ATTR_SHOWGRIDY |
| TIDY_ATTR_SIZE |
| TIDY_ATTR_SPAN |
| TIDY_ATTR_SRC |
| TIDY_ATTR_STANDBY |
| TIDY_ATTR_START |
| TIDY_ATTR_STYLE |
| TIDY_ATTR_SUMMARY |
| TIDY_ATTR_TABINDEX |
| TIDY_ATTR_TARGET |
| TIDY_ATTR_TEXT |
| TIDY_ATTR_TITLE |
| TIDY_ATTR_TOPMARGIN |
| TIDY_ATTR_TYPE |
| TIDY_ATTR_USEMAP |
| TIDY_ATTR_VALIGN |
| TIDY_ATTR_VALUE |
| TIDY_ATTR_VALUETYPE |
| TIDY_ATTR_VERSION |
| TIDY_ATTR_VLINK |
| TIDY_ATTR_VSPACE |
| TIDY_ATTR_WIDTH |
| TIDY_ATTR_WRAP |
| TIDY_ATTR_XML_LANG |
| TIDY_ATTR_XML_SPACE |
| TIDY_ATTR_XMLNS |
Table 4. tidy node type constants
| constant | description |
|---|---|
| TIDY_NODETYPE_ROOT | root node |
| TIDY_NODETYPE_DOCTYPE | doctype |
| TIDY_NODETYPE_COMMENT | HTML comment |
| TIDY_NODETYPE_PROCINS | processing instruction |
| TIDY_NODETYPE_TEXT | text |
| TIDY_NODETYPE_START | start tag |
| TIDY_NODETYPE_END | end tag |
| TIDY_NODETYPE_STARTEND | empty tag |
| TIDY_NODETYPE_CDATA | CDATA |
| TIDY_NODETYPE_SECTION | XML section |
| TIDY_NODETYPE_ASP | ASP code |
| TIDY_NODETYPE_JSTE | JSTE code |
| TIDY_NODETYPE_PHP | PHP code |
| TIDY_NODETYPE_XMLDECL | XML declaration |
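To show how the tag constants are typically used, here is a small sketch that compares a node's id property against TIDY_TAG_A to collect link targets. collect_hrefs() is a hypothetical helper and the sample markup is invented; the type property can be compared against the TIDY_NODETYPE_XXX constants in the same way.

```php
<?php
// Hypothetical helper: collect the href value of every <a> element
// at or below $node.
function collect_hrefs($node)
{
    $hrefs = array();
    // id holds one of the TIDY_TAG_* constants for element nodes; it may be
    // unset or NULL on other nodes, hence the isset() guard.
    if (isset($node->id) && $node->id == TIDY_TAG_A && isset($node->attribute['href'])) {
        $hrefs[] = $node->attribute['href'];
    }
    if ($node->hasChildren()) {
        foreach ($node->child as $child) {
            $hrefs = array_merge($hrefs, collect_hrefs($child));
        }
    }
    return $hrefs;
}

if (extension_loaded('tidy')) {
    $tidy = tidy_parse_string('<p><a href="/one">1</a> <a href="/two">2</a></p>');
    print_r(collect_hrefs(tidy_get_root($tidy)));
} else {
    echo "The tidy extension is not loaded.\n";
}
```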
Examples
This simple example shows basic Tidy usage.
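A minimal sketch of basic usage, assuming the PHP 5 procedural API; the sample markup and the option values are illustrative assumptions rather than text taken from the manual.

```php
<?php
// Deliberately sloppy input: unclosed <b>, no <html> or <body>.
$html = '<p>Hello <b>world</p>';

// The keys are Tidy configuration options; the values are only examples.
$config = array(
    'indent'       => true,
    'output-xhtml' => true,
    'wrap'         => 200,
);

if (extension_loaded('tidy')) {
    $tidy = tidy_parse_string($html, $config, 'utf8'); // parse with the given options
    tidy_clean_repair($tidy);                          // apply clean-up and repair
    echo tidy_get_output($tidy);                       // print the repaired markup
} else {
    echo "The tidy extension is not loaded.\n";
}
```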
- Table of Contents
- ob_tidyhandler -- ob_start callback function to repair the buffer
- tidy_access_count -- Returns the number of Tidy accessibility warnings encountered in the document
- tidy_clean_repair -- Executes the configured clean-up and repair operations on an HTML file
- tidy_config_count -- Returns the number of Tidy configuration errors encountered in the document
- tidy::__construct -- Constructs a new tidy object
- tidy_diagnose -- Runs the diagnostics on the parsed and repaired document
- tidy_error_count -- Returns the number of Tidy errors encountered in the document
- tidy_get_body -- Returns a tidyNode object starting from the <body> tag
- tidy_get_config -- Gets the current Tidy configuration
- tidy_get_error_buffer -- Returns the warnings and errors that occurred while parsing the document
- tidy_get_head -- Returns a tidyNode object starting from the <head> tag
- tidy_get_html_ver -- Detects the HTML version used in a document
- tidy_get_html -- Returns a tidyNode object starting from the <html> tag
- tidy_get_opt_doc -- Returns the documentation for the given option name
- tidy_get_output -- Returns a string representing the markup as parsed by Tidy
- tidy_get_release -- Returns the release date (version) of the Tidy library
- tidy_get_root -- Returns a tidyNode object representing the root of the HTML document
- tidy_get_status -- Returns the status of the specified document
- tidy_getopt -- Returns the value of the specified Tidy configuration option
- tidy_is_xhtml -- Indicates whether the document is an XHTML document
- tidy_is_xml -- Indicates whether the document is a generic (non-HTML/XHTML) XML document
- tidy_load_config -- Loads an ASCII Tidy configuration file with the specified encoding
- tidy_node->get_attr -- Returns the value of the specified attribute
- tidy_node->get_nodes -- Returns an array of the nodes under the current node that have the specified ID
- tidy_node->next -- Returns the next sibling of the current node
- tidy_node->prev -- Returns the previous sibling of this node
- tidy_parse_file -- Parses the markup in a file or URI
- tidy_parse_string -- Parses an HTML document stored in a string
- tidy_repair_file -- Repairs a file and returns it as a string
- tidy_repair_string -- Repairs an HTML string using an optional configuration file
- tidy_reset_config -- Restores the Tidy configuration to its default values
- tidy_save_config -- Saves the current configuration to a file
- tidy_set_encoding -- Sets the character encoding for the Tidy parser's input/output
- tidy_setopt -- Sets the value of the specified Tidy configuration option
- tidy_warning_count -- Returns the number of Tidy warnings encountered in the specified document
- tidyNode->hasChildren -- Returns true if the node has children
- tidyNode->hasSiblings -- Returns true if the node has siblings
- tidyNode->isAsp -- Returns true if this node is ASP code
- tidyNode->isComment -- Returns true if the node represents a comment
- tidyNode->isHtml -- Returns true if the node is part of an HTML document
- tidyNode->isJste -- Returns true if this node is JSTE
- tidyNode->isPhp -- Returns true if this node is PHP
- tidyNode->isText -- Returns true if the node represents text (no markup)
Valid XHTML STRICT
<?php
if (function_exists('tidy_repair_string'))
{
$xhtml = tidy_repair_string($xhtml, array('output-xhtml' => true, 'show-body-only' => true, 'doctype' => 'strict', 'drop-font-tags' => true, 'drop-proprietary-attributes' => true, 'lower-literals' => true, 'quote-ampersand' => true, 'wrap' => 0), 'raw');
}
?>
To install Tidy correctly for PHP 5 on Ubuntu, follow this link:
http://ubuntuforums.org/showthread.php?t=195636
In fact, you need to run a "make clean" before the commands "make" and "make install"
I had a lot of trouble combining Tidy with a JavaScript that grabs mouse events on an image (obviously).
I found this solution:
'output-xhtml' => false
and everything is working again!
I have been searching for an easy way to check an entire website for valid HTML/XHTML (no errors, compliant, etc.). Tidy is very useful for that:
<?php
/** already checked pages */
$e=array();
/** webpages to check */
$t=array("/web/test.com/");
/** forbidden extensions (typically linked resources) */
$x=explode(",","jpg,gif,png,doc,xls,pdf");
echo "<pre>";
// loop until the queue is empty (avoids reading $t[0] on an empty array)
while (!empty($t)) {
// already checked or a resource => skip
if (in_array($t[0],$e) || in_array(substr($t[0],-3),$x)) array_shift($t);
else { $c=array_shift($t); $e[]=$c; $t=array_merge($t,ck($c)); }
}
echo "</pre>";
/**
check_validity($url,$server)
return : list of the internal links of the page
*/
function ck($u,$s="http://127.0.0.1") {
$c=array("indent"=>1,"output-xhtml"=>1,"accessibility-check"=>3);
$t=tidy_parse_string(file_get_contents($s.$u),$c);
tidy_clean_repair($t);
if (tidy_error_count($t)) { // we have error, display them
echo "FAIL ".htmlentities($u)." (".tidy_error_count($t)." errors)\n";
echo htmlentities(tidy_get_error_buffer($t))."\n";
} else { // all right
echo "OK ".htmlentities($u)."\n";
}
// return all the links inside the page
return gl(tidy_get_root($t),substr($u,-1)=="/"?$u:dirname($u)."/");
}
/**
get_links($tidynode,$baseurl)
return : list of the links
*/
function gl($t,$b) {
$r=array();
$c=count($t->child);
for ($i=0;$i<$c;$i++) {
$e=$t->child[$i]; // objects are handles in PHP 5, no reference needed
if ($e->name=="a") { // a link
$h=$e->attribute["href"]; // url
if (substr($h,0,4)!="http") { // prevent external links
$r[]=sp(substr($h,0,1)=="/"?$h:$b.$h);
}
} else { // not a link, search recursively inside
$r=array_merge($r,gl($e,$b));
}
}
return $r;
}
/**
simplify_path($path)
return : simplified path
*/
function sp($p) {
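$o=null; // previous value; initialised so the first comparison is defined
while ($o!==$p) {
$o=$p;
$p=str_replace(array("//","/./"),"/",$p);
// collapse "/dir/../" (dots escaped so they match literal dots only)
$p=preg_replace("/\/[^\/]+\/\.\.\//","/",$p);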
}
return $p;
}
?>
Limitation: it does not detect JavaScript-generated links. Consider calling set_time_limit(0) if you have a lot of pages to check.
To get libtidy and PHP 5.0.5 compiled on OS X Tiger this is what I needed to do:
1) download and unpack the tidy source.
2) cd tidy-source-dir
3) >> /bin/sh build/gnuauto/setup.sh
4) then you can configure/make/make install as normal
PHP build generates errors because of tidy so I needed to edit the platform.h file like this (use your favorite command line editor):
5) >> sudo emacs /usr/local/include/platform.h
6) comment out line 508 which was causing the 'duplicate "unsigned" ' error in the PHP build.
7) configure/make/make install PHP as normal using --with-tidy=/usr/local
Restart apache and everything works now. HTH someone.
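After a rebuild like the one described above, a quick way to confirm that PHP picked up Tidy is to ask the extension for its library release. This is only a sketch: describe_tidy_build() is a hypothetical helper name, and the exact release string depends on the libtidy you compiled against.

```php
<?php
// Hypothetical helper: say whether the tidy extension is loaded and,
// if so, which libtidy release it reports.
function describe_tidy_build()
{
    if (!extension_loaded('tidy')) {
        return 'tidy extension is not loaded';
    }
    return 'tidy extension loaded, libtidy release: ' . tidy_get_release();
}

echo describe_tidy_build(), "\n";
```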
<?php
//
//The tidy tree of your favorite !
//For PHP 5 (CGI)
//Thanks to john@php.net
//
$file="http://www.php.net";
//
$cns=get_defined_constants(true);
$tidyCns=array("tags"=>array(),"types"=>array());
foreach($cns["tidy"] as $cKey=>$cVal){
if($cPos=strpos($cKey,$cStr="TAG")) $tidyCns["tags"][$cVal]="$cStr : ".substr($cKey,$cPos+strlen($cStr)+1);
elseif($cPos=strpos($cKey,$cStr="TYPE")) $tidyCns["types"][$cVal]="$cStr : ".substr($cKey,$cPos+strlen($cStr)+1);
}
$tidyNext=array();
//
echo "<html><head><meta http-equiv='Content-Type' content='text/html; charset=windows-1252'><title>Tidy Tree :: $file</title></head>";
echo "<body><pre>";
//
tidyTree(tidy_get_root(tidy_parse_file($file)),0);
//
function tidyTree($tidy,$level){
global $tidyCns,$tidyNext;
$tidyTab=array();
$tidyKeys=array("type","value","id","attribute");
foreach($tidy as $pKey=>$pVal){
if(in_array($pKey,$tidyKeys)) $tidyTab[array_search($pKey,$tidyKeys)]=$pVal;
}
ksort($tidyTab);
foreach($tidyTab as $pKey=>$pVal){
switch($pKey){
case 0 :
if($pVal==4) $value=true; else $value=false;
echo indent(true,$level).$tidyCns["types"][$pVal]."\n"; break;
case 1 :
if($value){
echo indent(false,$level)."VALUE : ".str_replace("\n","\n".indent(false,$level),$pVal)."\n";
}
break;
case 2 :
echo indent(false,$level).$tidyCns["tags"][$pVal]."\n"; break;
case 3 :
if($pVal!=NULL){
echo indent(false,$level)."ATTRIBUTES : ";
foreach ($pVal as $aKey=>$aVal) echo "$aKey=$aVal "; echo "\n";
}
}
}
if($tidy->hasChildren()){
$level++; $i=0;
$tidyNext[$level]=true;
echo indent(false,$level)."\n";
foreach($tidy->child as $child){
$i++;
if($i==count($tidy->child)) $tidyNext[$level]=false;
tidyTree($child,$level);
}
}
else echo indent(false,$level)."\n";
}
//
function indent($tidyType,$level){
global $tidyNext;
$indent="";
for($i=1;$i<=$level;$i++){
if($i<$level||!$tidyType){
if($tidyNext[$i]) $str="| "; else $str=" ";
}
else $str="+--";
$indent=$indent.$str;
}
return $indent;
}
//
echo "</pre></body></html>";
//
?>
Using PHP 5.1.2 on Win32/IIS, I noticed that even with "output-xhtml: yes," tidy was adding the deprecated name attribute to form tags (using the value of the id attribute). Grabbing the latest dll from the snaps link at the top of the page fixed this.
It should be noted that the examples on this page apply ONLY to PHP5. None of the functions in the manual apply to PHP4. The names are the same but arguments are different on some of them (tidy_parse_string).
If you wish to use tidy in PHP 4.3.x you can use the following example instead:
<?php
$tidyhtml = ob_get_contents();
if( function_exists( 'tidy_parse_string' ) ) {
tidy_set_encoding('iso-8859-1');
tidy_parse_string($tidyhtml);
tidy_setopt('output-xhtml', TRUE);
tidy_setopt('indent', TRUE);
tidy_setopt('indent-spaces', 2);
tidy_setopt('wrap', 200);
tidy_clean_repair();
$tidyhtml = tidy_get_output();
}
ob_end_clean();
echo $tidyhtml;
?>
Hope that helps somebody.
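For comparison, a rough PHP 5 counterpart of the snippet above might look like the following. It is a sketch, not a drop-in replacement: repair_html() is a hypothetical wrapper name, the sample markup is made up, and 'latin1' is assumed to be the Tidy encoding name that corresponds to iso-8859-1.

```php
<?php
// Hypothetical wrapper: repair $html with the same options as the PHP 4
// snippet, or return it unchanged when tidy is unavailable.
function repair_html($html)
{
    if (!function_exists('tidy_repair_string')) {
        return $html;
    }
    $config = array(
        'output-xhtml'  => true,
        'indent'        => true,
        'indent-spaces' => 2,
        'wrap'          => 200,
    );
    // In PHP 5 the options and encoding are passed in a single call.
    return tidy_repair_string($html, $config, 'latin1');
}

ob_start();
echo '<p>Hello <b>world</p>';
$tidyhtml = ob_get_clean(); // fetch the buffered output and stop buffering
echo repair_html($tidyhtml);
```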
For those who need to install libtidy on Mac OS X, here is a guide that worked for me:
If you're on Mac OS X, you'll need to tell the Makefile that you use ranlib:
$ export set RANLIB=ranlib
Change to the directory with the Makefile in it, and run make.
This example uses the GNU make Makefile.
$ cd tidy/build/gmake/
$ make
if [ ! -d ./obj ]; then mkdir ./obj; fi
gcc -o obj/access.o ...
... etc etc etc ...
Install the libs, headers and the tidy executable:
$ sudo make install
If you're on Mac OS X, you'll have to run ranlib again on the installed lib:
$ sudo ranlib /usr/local/lib/libtidy.a
Rough installation instructions for debian/testing:
Use debian's apt package manager to install the required development packages
$ apt-get install php4-dev php4-pear libtidy-dev
Then use pear to install tidy
$ pear install tidy
Note: I did /not/ have success installing the tarball locally. Only using this method was the .so put in the correct place.
I also had to add an entry to the php.ini
$ echo extension=tidy.so >> /etc/php4/apache/php.ini
$ apachectl restart
...and you're done.
I was installing PHP 5.0.2 on Red Hat Linux (I forget the version; Enterprise WS 3, I think) and had trouble with libtidy: configure consistently complained that it could not find 'libtidy'. I finally found a clue about how to install it in build/gnuauto/readme.txt. This is how I got it to install (after lots of trial and error):
First, don't get the binary distribution from tidy.sf.net. It's not what you want. You need the source distribution.
Command by command this is what I did:
=======
wget http://tidy.sourceforge.net/src/tidy_src.tgz
tar -xzf tidy_src.tgz
cd tidy
/bin/sh build/gnuauto/setup.sh
./configure --prefix=/usr
make
make install
cd [php source directory]
./configure --with-tidy=/usr --[other extensions]
make
make install
=======
Tada. Finally it doesn't complain when I configure PHP about the installation. The info I needed was stuck in that build/gnuauto/readme.txt file in the tidy directory.
Took me a while. Hope my trials can help others save time.
Doodleelephant
Installing tidy on Fedora Core 2 required three libraries:
tidy...
tidy-devel...
libtidy...
All of which I found at http://rpm.pbone.net
Then, finally, I could run "./configure --with-tidy"
Hope this helps someone out. This was "REALLY" hard (for me) to figure out, as it wasn't clearly documented anywhere else.
